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1  Introduction 


Transforms  associated  with  classical  special  functions  occur  in  a  wide  variety  of  fields.  Examples  of 
such  transforms  are  Fourier-Bessel  (Hankel)  transforms,  orthogonal  polynomial  transforms,  Fourier 
transforms,  etc.  They  are  encountered  in  geophysical  modeling,  submarine  warfare,  quantum  me¬ 
chanical  calculations,  simulation  of  radar  phenomena,  etc.  For  the  purposes  of  this  paper,  we  define 
special  functions  as  the  eigenfunctions  on  some  (finite  or  infinite)  interval  [a,  b]  of  the  differential 
equation 

Jr  6^)  +  +  Au;(x))  <p  =  0,  (1.1) 

subject  to  appropriate  boundary  conditions  at  the  points  a  and  b.  The  class  of  special  functions  in 
question  is  defined  by  the  interval  [a,  b],  the  functions  w,p,r  :  [a,  b]  — >  R,  and  the  boundary  condi¬ 
tions;  the  function  w  is  assumed  to  be  strictly  positive.  We  will  be  denoting  the  nth  eigenfunction 
of  (1.1)  by  pn,  and  the  corresponding  eigenvalue  by  Xn.  As  is  well-known  (for  example,  see  [2]),  the 
functions  tpi,  <p2,  P3,  ■  ■  ■  constitute  an  orthogonal  basis  in  L2[a,  b]  with  respect  to  the  inner  product 
defined  by  the  weight  function  w.  Thus,  for  any  square  integrable  function  /  :  [a,  b]  — >  R  there 
exists  a  sequence  of  real  numbers  a±,  a2,  oq, . . .  such  that 


/(®)  =  '52aj(Pj(x)’ 

3= 1 


with 


and 


<X;  = 


f(x)  Pj(x)  w(x )  dx, 


■/3\\w  J  a 


UJ\\  w 


rb 

=  /  ip2Ax)w{x)dx. 

J  a 


(1.2) 


(1.3) 


(1.4) 


Given  a  collection  of  n  points  aq,  X2,  ■  ■  ■ ,  xn  on  [a,  b ],  it  is  often  desirable  to  determine  coefficients 
j3\ ,  /%,  •  •  • ,  Pn  such  that  for  k  =  1,  2, . . . ,  n 


f(xk)  =  £/%  <*(**)■ 

3= 1 

It  is  possible  to  choose  xi,  X2,  ■  ■  ■ ,  xn  such  that  the  linear  mapping  A  :  Mn  — ►  Mn, 


(1.5) 


A{f(xi),f(x2),...,f{xn))  =  (f3i,(h,...,(3n) 

is  well-conditioned.  The  mapping  in  equation  (1.6)  is  given  in  matrix  form  as 


/ w\  ip\(x\)  w2(pi(x2) 
wnp2{xi)  w2tp2(x2) 

\wi(pn(xi) 


Wn  Pl{Xn) 


W n  / 


f  f  Oi)\ 

(Pl\ 

/(*  2) 

~ 

02 

\Pn) 

(1.6) 


(1.7) 


where  w\ ,  w2 , . . . ,  wn  are  determined  from  the  weight  function  w  and  the  norm  of  the  functions 
ip i,  ip2,  •  •  • ,  p>n-  With  the  proper  choice  of  n  and  xi,  x2, . . . ,  xn,  not  only  is  A  well-conditioned,  but 
the  function  g  :  [a,  b]  — >  R,  given  by  the  formula 


s(x)  = 

3=1 


(1.8) 
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provides  an  accurate  and  stable  approximation  to  /  for  all  x  G  [a,  b\.  Additionally,  since  A  is 
well-conditioned,  so  is  its  inverse  A _1,  making  both  matrices  suitable  for  numerics.  The  direct 
application  of  A  and  A~l  each  require  0{n 2)  operations,  while  the  algorithm  presented  in  this 
paper  enables  the  numerical  application  of  A  and  A in  only  0(n  log  n)  operations.  Our  algorithm 
enables  an  accelerated  numerical  summation  of  the  functions  ipj{xk)  over  j  or  k.  For  reasons 
described  later  in  the  paper,  we  will  refer  to  our  procedure  as  the  “butterfly”  algorithm. 

Another  class  of  transforms  common  in  mathematical  physics  and  classical  mathematics  are 
integral  transforms.  An  integral  transform  I\  :  L2[a,  b]  —>  L2[c,d\  maps  functions  on  one  interval, 
[a,  b\,  to  functions  on  another  interval,  [c,  d] .  For  square  integrable  functions  /  :  [a,6]  — >  C  and 
k  :  [a,  b]  x  [c,  d]  — ►  C,  the  operator  K  takes  the  form 

(Kf)(y)  =  f  k(x,y)  f(x)dx.  (1.9) 

J  a 


The  function  k  is  referred  to  as  the  kernel  of  the  integral  transform.  One  may  desire  to  evaluate  the 

function  Kf  at  n  values  yi,  j/2, _ ,  yn-  Standard  methods  for  the  numerical  application  of  integral 

operators  replace  the  integral  in  equation  (1.9)  with  a  quadrature  consisting  of  appropriately  chosen 
points  xi, . . . ,  xm  and  weights  wi, ,  wrn  such  that  for  some  class  of  appropriately  chosen  functions 
/, 


m 

^Wi  k(xi,yj)  f(xi) 

i=  1 


k{x,yj)  f(x)dx 


(1.10) 


for  j  =  1,2,...,  77..  The  convergence  of  the  approximation  is  consistent  with  the  order  of  the 
quadrature  chosen.  The  computation  of  the  integral  in  equation  (1.9)  for  y\,y2,  ■  ■  ■  ,yn  has  thus 
been  reduced  to  a  matrix-vector  multiplication.  For  certain  choices  of  the  kernel  k.  we  are  able  to 
accelerate  this  multiplication  using  the  algorithm  of  this  paper. 

One  such  integral  transform  is  the  Fourier-Bessel  transform.  Fourier-Bessel  transforms  arise 
when  computing  the  Fourier  transform  of  a  radially  symmetric  function  on  the  disk.  This  occurs 
commonly  in  the  fields  of  optics,  acoustics,  geophysics,  etc.  Let  R  be  a  positive  real  number  and  v 
a  positive  integer.  Suppose  that  we  have  a  function  /  :  R2  — >  C,  given  by  the  formula 


f(x,y)  =  g(r)elu\ 


(1.11) 


where  (r,  9)  are  the  usual  polar  coordinates,  and  the  function  g  :  M+  — ►  K  is  such  that  g(r)  =  0  for 
r  >  R.  After  a  change  of  variables,  the  Cartesian  Fourier  transform 


(Rf)(u,v)  =  J  J  el{ux+vy)  f  (x,  y)  dx  dy 

R2 


(1.12) 


becomes 

where 

u  =  p  cos  ip,  x  =  r  cos  9, 

v  =  p  sin  'L,  y  =  r  sin0.  (1.14) 


i"  ell/'p  [  Ju(pr)g(r)rdr ,  (1.13) 

Jo 


The  integral  in  equation  (1.13)  is  referred  to  as  the  Fourier-Bessel  (or  Hankel)  transform  of  order 
n  of  the  function  g. 
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If  we  wish  to  evaluate  the  integral  in  equation  (1.13)  for  n  values  of  p,  pi,  p2, . . . ,  pn,  we  may 
discretize  the  integral  and  use  the  algorithm  of  this  paper  to  apply  the  resulting  matrix.  For  reasons 
described  in  Section  2.2,  p\,  p2,  ■  ■  ■ ,  pn  are  usually  chosen  to  be  the  first  n  positive  roots  of  Jv. 
Previous  methods  for  numerically  computing  Fourier-Bessel  transforms  include  using  an  exponential 
change  of  variables,  asymptotic  expansions  of  Bessel  functions,  and  projection  techniques  (see  [5], 
[12],  and  [17]).  These  algorithms  often  had  limitations  on  the  choices  for  p  in  equation  (1.13)  (due 
to  the  requirement  of  equispaced  points  for  FFTs),  the  requirement  of  sampling  g  on  an  exponential 
grid,  and  on  the  order  v  of  the  transform.  In  this  paper  we  eliminate  all  such  restrictions. 

It  should  be  pointed  out  that  the  algorithm  of  this  paper  is  very  similar  to  that  of  [16],  and  has 
been  motivated  by  the  latter. 

The  paper  has  the  following  format:  Section  2  gives  some  background  information  and  lemmas 
concerning  certain  special  functions,  and  the  approximation  of  low-rank  matrices.  In  Section  3  we 
build  the  analytical  apparatus  to  be  used  in  the  construction  of  the  algorithm  of  this  paper.  In 
Section  4  we  describe  our  algorithm.  Section  5  contains  numerical  examples  illustrating  the  perfor¬ 
mance  of  the  algorithm,  and  Section  6  summarizes  the  work  and  discusses  its  possible  extensions. 

2  Mathematical  and  numerical  preliminaries 

This  section  contains  definitions,  lemmas,  and  basic  mathematical  facts  that  are  used  in  the  re¬ 
mainder  of  the  paper.  For  any  positive  integer  n,  and  column  vector  v  £  Cn,  we  define  the  norm 
||u||  of  v  to  be  the  root-sum-square  ( l 2  norm)  of  the  entries  of  v,  that  is, 


n 


(2.1) 


where  Vj  is  the  jth  entry  of  v.  For  any  positive  integer  m  and  matrix  A  £  Cmxn,  we  define  the 
norm  ||A||  of  A  to  be  the  spectral  (Z2-operator)  norm  of  A, 

11^411  =  max  ^  ^  =  max  |u7j  (2-2) 

xGCn  1 1  £C  1 1  l<j<min(m,n) 


where  <7i,  02, . . . ,  crmin(m,n)  are  the  singular  values  of  A.  (Obviously,  the  norm  of  v  as  viewed  as 
an  n  x  1  matrix  is  equal  to  the  norm  of  v  as  viewed  as  an  n  x  1  column  vector.)  The  transpose 
of  matrix  A  will  be  denoted  by  At.  Unless  otherwise  stated,  we  denote  the  natural  logarithm  of 
x  £  M+  by  log  x.  The  largest  integer  that  is  less  than  or  equal  to  x  will  be  denoted  by  ,  he. 
is  the  floor  of  x.  We  use  L2[a,  b]  to  denote  the  space  of  square  integrable  functions  /  :  [a,  b ]  — >  C, 

L2[a,6]  =  |/:  f  \f(x)\2  dx  <  ooj  .  (2.3) 


2.1  Approximation  of  low-rank  matrices 

The  principal  numerical  tool  used  in  this  paper  is  the  approximation  of  a  series  of  low-rank  matrices 
via  interpolative  decompositions.  Lemma  2.1  below  restates  (in  a  slightly  different  form)  Theorem  3 
in  [4],  It  states  that  for  any  m  x  n  matrix  A  whose  approximate  rank  is  k,  there  exist  an  m  x  k 
matrix  B  whose  columns  constitute  a  subset  of  the  columns  of  A,  and  a  k  x  n  matrix  P ,  such  that 

1.  some  subset  of  the  columns  of  P  makes  up  the  k  x  k  identity  matrix, 
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2.  P  is  not  too  large,  and 

3.  B  P  —  A  is  small. 

Lemma  2.1  Suppose  that  m  and  n  are  positive  integers,  and  A  is  a  real  in  x  n  matrix.  Then,  for 
any  positive  integer  k  with  k  <  m  and  k  <  n,  there  exist  a  real  k  x  n  matrix  P ,  and  a  real  m  x  k 
matrix  B  whose  columns  constitute  a  subset  of  the  columns  of  A,  such  that 

1.  some  subset  of  the  columns  of  P  makes  up  the  k  x  k  identity  matrix, 

2.  no  entry  of  P  has  an  absolute  value  greater  than  1, 

3.  ||P||  <  y/k{n  —  k)  +  \  , 

4 ■  the  least  (that  is,  the  kth  greatest)  singular  value  of  P  is  at  least  1, 

5.  B  P  =  A  when  k  =  m  or  k  =  n,  and 

6.  \\B  P  —  t4||  <  y/k  (n  —  k)  +  1  (Jk+i  when  k  <m  and  k  <  n,  where  ak+i  is  the  (k  +  l)st  greatest 
singular  value  of  A. 

Remark  2.2  Properties  1,  2,  3,  and  4  in  Lemma  2.1  ensure  that  the  interpolative  decomposition 
B  P  of  A  is  numerically  stable.  It  should  also  be  observed  that  Property  3  follows  directly  from 
Properties  1  and  2,  and  Property  4  follows  directly  from  Property  1. 

Observation  2.3  Existing  algorithms  for  the  construction  of  the  matrices  B  and  P  in  Lemma  2.1 
are  computationally  expensive.  We  use  an  algorithm  to  produce  B  and  P  which  satisfy  conditions 
slightly  more  relaxed  than  those  in  Lemma  2.1,  but  still  sufficient  for  numerical  purposes.  We 
compute  B  and  P  such  that 

1.  some  subset  of  the  columns  of  P  makes  up  the  k  x  k  identity  matrix, 

2.  no  entry  of  P  has  an  absolute  value  greater  than  2, 

3.  ||P||  <  ^/4/c  ( n  —  k)  +  1, 

4.  the  least  (that  is,  the  kth  greatest)  singular  value  of  P  is  at  least  1, 

5.  B  P  =  A  when  k  =  rn  or  k  =  n,  and 

6.  || BP  —  t4||  <  \J Ak  (n  —  k)  +  1  ct^+i  when  k  <  m  and  k  <  n,  where  a^+i  is  the  ( k  +  l)st 
greatest  singular  value  of  A. 

Thus,  for  any  positive  real  number  e,  the  algorithm  can  identify  the  least  k  such  that  \\B  P— yl||  ~  e. 
Furthermore,  there  exists  a  real  number  C  such  that  the  algorithm  computes  both  B  and  P  using 
at  most  Ckmn  log(n)  floating-point  operations  and  Ckmn  floating-point  words  of  memory  (see  [4], 
[11],  and  [15]). 
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2.2  Bessel  functions 


This  section  contains  a  brief  summary  of  various  facts  regarding  Bessel  functions,  and  defines  the 

Fourier-Bessel  transform  (see  [1]  and  [23]).  For  any  integer  is  >  0,  we  denote  by  Ju  the  Bessel 

(12) 

function  of  the  first  kind  of  order  is,  Yu  the  Bessel  function  of  the  second  kind  of  order  is ,  and  HI  ' 
the  Hankel  function  of  the  first  and  second  kinds  of  order  is.  For  any  z£  C, 

Hi1\z)  =  Ju(z)  +  iYu(z),  (2.4) 

H(?\z)  =  Jv(z)-iYv(z).  (2.5) 

The  functions  Ju  and  Yv  are  two  linearly  independent  solutions  to  the  differential  equation 


d  (  dw 
dz  V  dz 


+  (z~- r=0’ 


given  by  the  formulas 


Jv{z)  =  —  cos (z  sin  9  —  isO)  dd , 

^  Jo 


Yu(z)  =  —  f  sin(2:  sin  9  —  isO)  d6  —  — 


{evt  +  e  ut  cos  istt)  e 


—z  sinht 


where  we  require  in  (2.8)  that  Re(z)  >  0.  The  functions  Jq,  J\,  J2,  ■  ■  ■  are  known  to  satisfy  the 
recurrence 

Ju+i{z)  =  —Ju(z)  -  Jv-i(z).  (2.9) 

Bessel  functions  of  the  second  kind  (and  thus  Hankel  functions  of  the  first  and  second  kind)  satisfy 
the  same  recurrence  as  Bessel  functions  of  the  first  kind,  as  in  (2.9). 

Remark  2.4  Recurrence  (2.9)  can  be  used  for  the  numerical  evaluation  of  Bessel  functions.  For 
x  £  R+,  given  initial  values  Yq(x)  and  Y\(x),  the  recurrence  is  stable  and  can  be  used  to  accurately 
calculate  Y2{x),  Y3(x), . . .  .  On  the  other  hand,  given  initial  conditions  Jq(x)  and  J\(x),  recur¬ 
rence  (2.9)  is  unstable.  However,  as  is  well  known,  given  initial  conditions  Jn{x)  and  Jn+i(x)  for 
some  large  N,  the  backwards  recurrence 


Ju— 1(2)  —  ^  Jv(z)  Ju+ 1(2) 

is  stable  and  can  be  used  to  calculate  Jjv-i(x),  . . . ,  J\  (x),  Jq(x)  (see,  for  example,  [1]). 


(2.10) 


For  a  fixed  order  is,  we  use  the  method  found  in  [9]  to  numerically  integrate  differential  equation  (2.6) 
to  evaluate  Bessel  functions  of  the  first  kind.  For  any  A  £  C,  after  a  change  of  variable  equation  (2.6) 
becomes 

(2.11) 


which  has  linearly  independent  solutions 


—  \f~%  Jl/  (Az)  , 


(2.12) 


%{z)  =  sTzYv{\z). 


(2.13) 


It  turns  out  that  for  any  v  >  0,  and  any  subinterval  [0,  R]  C  M+,  there  exists  a  choice  of  Ai,  A2,  A3, . . . 
such  that 


[R 

/  r  Ju(Xir)  Ju(\jr)  dr  =  0 
Jo 


(2.14) 


for  i  /  j.  Indeed,  for  any  R  >  0  and  integer  u  >  0,  let  the  real  numbers  p\  <  P2  <  po  <  ■  ■  ■  be  all 
the  positive  zeros  of  the  function  g  :  M+  — >  M,  given  by  the  formula 


g(p)  =  Jv{2npR). 

Let  the  real  valued  functions  ./,,,  1 .  Ju Ju, 3, ...  on  [0,  R]  be  given  by  the  formula 

\/2  y/r  Jv(2Tipkr) 


Jv,k{r)  — 


R  Ju+i(2npkR) 


(2.15) 


(2.16) 


It  is  easily  verified  (see  [23])  that  the  functions  Jup,  Jup,  Ju, 3, . . .  are  dense  in  L2[0,  R]  and  orthonor¬ 
mal,  i.e.  for  any  j,  k  >  0 


rR 


Ju,j(r)  Jv,k(r)  dr  =  Sjk, 


(2.17) 


where  6jk  is  the  Dirac  delta  function.  To  numerically  calculate  the  roots  pi,  P2,  ps,  ■  ■ we  use  the 
method  found  in  [9].  Since  the  functions  JVji,  Jup,  3, . . .  are  dense,  we  may  express  any  square 
integrable  function  /  :  [0,  R]  — ►  M  in  the  form 


/O’)  =  ^22  Pk 


(2.18) 


k= 1 


where  the  coefficients  (3%,  (3 3 , . . .  are  given  by  the  formula 


rR 


Pk=  Jv,k{r)  f(r)dr 
Jo 

V2  1 


rR 


R  pkR)  Jo 


fr  Ju(2npkr)  f(r)  dr. 


(2.19) 


The  number  /?£  is  known  as  the  kttl  Fourier-Bessel  coefficient  of  order  v  of  the  function  /.  We  call 
the  mapping  Avn  :  L2[ 0,  R]  — >  Mn, 


(2.20) 


the  Fourier-Bessel  transform  of  order  v  and  size  n.  To  calculate  the  Fourier-Bessel  transform 
numerically,  it  is  necessary  to  evaluate  the  integral  in  (2.19)  for  k  =  1,2, ...  ,n.  We  numerically 
calculate  this  integral  using  an  (n  +  2)-point  trapezoidal  quadrature.  Setting  h  =  22  ■>  trapezoidal 
integration  gives 


Pk 


s/2  1 

R  Jv+i^npkR) 


h 


'  n- 1-1 


22  Jv(2npkjh)  f(jh ) 

j= 0 


s/2  1 

R  J,,+i(2TrpkR) 


h  22  VJh  Ju{2npkjh)  f(jh ), 
3= 1 


±VRM2npkR)f(R)j 


since  for  all  k  >  0 

Ju(2npkR)  =  0. 


(2.21) 


(2.22) 
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In  matrix  notation 


where 


P  ~  h  -r-  SJv  TJu  f , 

P=(Pl,PZ,-,Pn)\ 
f  =  (f(h),f{2h),...  ,f{nh)y  , 


(2.23) 

(2.24) 

(2.25) 


and 


Sj„  = 


l 

Jv+l(2irpiR) 


1 

J„+1(2np2R) 


Jv+l(2irpnR) ) 


(2.26) 


Tj, 


(  VhJ„(2Trpih) 


y/nh  Ju(2npinhy 


(2.27) 


\Vh  Ju(2npnh)  ■■■  y/nh  J u(2tt  pnnh) ) 

The  algorithm  we  describe  in  Section  4  accelerates  the  application  of  the  matrix  Tj„  in  equa¬ 
tion  (2.27). 


Remark  2.5  If  we  contrast  equation  (2.19)  with  equation  (1.13),  it  appears  as  though  the  Fourier- 
Bessel  transform  has  been  defined  with  two  different  kernels;  the  integrand  in  equation  (2.19) 
contains  a  yfr  and  the  integrand  in  equation  (1.13)  contains  an  r.  These  are  merely  two  different 
conventions,  and  the  difference  will  not  affect  the  speed  with  which  the  algorithm  of  Section  4 
applies  the  matrix  resulting  from  their  discretization.  Multiplication  (or  division)  by  an  extra  \/r 
in  the  integrand  is  a  well-conditioned  diagonal  transformation  and  has  a  negligible  effect  on  the 
numerical  rank  of  any  submatrix. 


In  addition  to  computing  Fourier-Bessel  coefficients,  one  may  wish  to  perform  other  numerical 
calculations  involving  Bessel  functions,  including  evaluating  sums  of  Bessel  and  Hankel  functions 
over  varying  order.  These  expansions  are  analogous  to  expansions  in  orthogonal  polynomials  (see 
[21]  and  [22]).  For  and  complex  ao,  aq, . . . ,  an_  i,  it  may  be  desirable  to  evaluate  the  sums 


n—  1 


g(x)  =  y^ctfc  Jfc(x) 

k=0 

(2.28) 

n—  1 

[x)  =Y^OikH^'2\x). 
k= 0 

(2.29) 

and 


To  evaluate  the  function  g  at  real  points  x\,  X2,  ■  ■  ■ ,  xn  we  must  apply  the  matrix  Ej , 

(  Jq(x  l)  •••  Jn-i(xi)' 


Ej  = 


(2.30) 


\«/o(a 


Jn— l  (*c»i)  i 


to  the  vector  (ao, aq, . . . , an_i)t.  To  evaluate  the  function  w  at  real  points  X\,X2,  ■  ■  ■  ,xn  we  must 
apply  the  matrix  Eh, 

(h^2\Xi)  •••  H^(Xly 


Eh  = 


(2.31) 


W’2)(*n)  •••  H^S(xn)j 
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to  the  vector  (ao,  cci, . . . ,  It  will  be  shown  in  Section  5  that  the  algorithm  of  this  paper  will 

accelerate  the  application  of  matrices  Ej  and  Eh- 

For  fixed  argument  and  large  order,  Bessel  functions  of  the  first  kind  have  a  predictable  behavior, 
used  in  proofs  contained  in  Section  3.  Lemma  2.6  below  shows  that  for  R  >  10  and  large  v,  Jv(r ) 
will  be  very  small  for  0  <  r  <  R.  It  is  reproduced  from  [18]. 

Lemma  2.6  For  any  real  R  >  10  and  0  <  e  <  0.1, 

JU(R)  <  e  (2.32) 


for  any 

v  >  (I?1/3  +  air1/3)3, 


where 


Thus,  Jv(r)  <  e  for  any  0  <  r  <  R. 


(2.33) 

(2.34) 


Remark  2.7  The  following  theorem  provides  an  error  estimate  for  the  approximation  of  an  integral 
by  the  n-point  trapezoidal  rule  (see  [6]),  and  is  a  motivation  for  the  popularity  of  trapezoidal 
integration  in  such  environments. 


Theorem  2.8  (Euler-Maclaurin  Summation  Formula)  For  any  n  >  1,  and  function  ip  : 
[a,  b]  — ►  M,  let 

h  =  ^4,  (2.35) 

n  —  1 

and 

LM  =  h  (g  ip(a  +  jh)  -  ,  (2.36) 

If  the  function  ip  has  2k +  2  continuous  derivatives,  then  for  some  £  G  [a,  b],  the  error  of  the  n-point 
trapezoidal  quadrature  rule  Tn  for  the  integration  of  p  on  [a,  b]  is  given  by  the  formula 


r-0 

Tn{p)  -  /  <p(x)  dx  =  y^j 

Jo,  l=l 


k  h2lB ■ 


(2/)! 


21  (P'-'m  -  P‘-\a))  + 


(2k  +  2)! 


where  Bj  is  the  ith  Bernoulli  number. 


(2.37) 


In  many  cases,  the  function  /  :  [0,  R]  — ►  M  whose  Fourier-Bessel  transform  is  to  be  taken  is  zero  at 
the  point  x  =  R,  along  with  many  of  its  derivatives.  Furthermore,  for  any  integer  v  >  2,  it  follows 
from  the  formula  (see  [10]) 


./ ty(x)  — 


X 


'£ 

k= 0 


(-1)' 


k\  T(u  +  k  +  1) 


(2.38) 


that  the  function  Ju  is  not  only  zero  at  the  point  x  =  0,  its  first  v  —  1  derivatives  are  also  zero  at  the 
point  x  =  0.  These  two  observations,  when  combined  with  Theorem  2.8,  show  that  the  trapezoidal 
rule  converges  rapidly  when  used  for  the  approximation  of  Fourier-Bessel  coefficients. 


2.3  Fourier  series 


The  following  information  concerning  complex  exponentials  can  be  found  in  any  standard  Fourier 
analysis  textbook  (see  [14],  for  example).  We  also  describe  discrete  Fourier  transforms  for  noneq- 
uispaced  data,  and  refer  the  reader  to  [7]  and  [8]  for  more  information. 

A  continuous  periodic  function  /  :  [ — 7r,  7t]  — >  C  can  be  represented  by  its  Fourier  series  as 

OO 

f(x)=  ckeikx,  (2.39) 

k=—oo 

where  c *.  is  given  by  the  formula 

ck  =  ±  f  f{x)e~ikxdx.  (2.40) 

Without  loss  of  generality,  assume  for  the  moment  that  n  >  0  is  an  even  integer.  If  /  is  supplied 
as  a  discrete  function,  known  only  at  n  equispaced  points  aq,  x2,  •  •  • ,  xn,  Xj  =  —7 r  +  7r^)~1^ ,  then  we 
can  represent  /  at  these  points  using  the  formula 

n 

f(xj)  =  '£akei“^,  (2.41) 

k=i 

where  =  Tp  +  k  —  1  and  is  given  by 

1  n 

ak  =  -YJf{xj)e-^.  (2.42) 

3= 1 

We  refer  to  the  linear  operator  Ap  :  Mn  — >  Mn, 


AF(f(x l),  /(x2),  •  •  • ,  f(xn ))  =  (ai,  a2, . . . ,  an),  (2.43) 


as  the  forward  discrete  Fourier  transform.  Determining  the  values  of  /  at  aq,  a;2, . . . ,  xn  given  the 
coefficients  a  i ,  q2  , . . .  ,  an  will  be  referred  to  as  the  inverse  discrete  Fourier  transform.  The  Fast 
Fourier  Transform  (FFT)  algorithm  for  computing  the  forward  and  inverse  transforms  is  widely 
known  (see,  for  example  [6]).  Note  that  the  FFT,  as  well  as  equation  (2.42),  relies  on  the  fact  that 
xi,  X2,  ■  ■  ■ ,  xn  are  equispaced  and  wi,w2,...,w„  are  integers  ranging  from  ^  to  ^  —  1. 

Suppose  now  that  we  wish  to  evaluate  sums  of  the  form 

n 

(Xk  =  f(xj)  e~iuJkXj  (2.44) 

3= 1 


where  aq,  x2, . . . ,  xn  is  an  arbitrary  set  of  n  real  numbers  on  [— n,  7r]  and  aq,  w2, . . . ,  ujn  is  an  arbitrary 
set  of  n  real  numbers  on  [Tip,  p],  neither  of  which  is  equispaced.  Sums  of  this  type  are  known  as 
discrete  Fourier  transforms  for  nonequispaced  data.  The  algorithm  developed  in  Section  4  can  be 
used  to  calculate  discrete  Fourier  transforms  for  nonequispaced  data,  as  well  as  equispaced  data. 
For  a  given  n,  this  requires  that  we  apply  the  matrix  Tp, 


0—lU)\X\ 


TF  = 


\e 


D—lUJ\Xn  ' 


o  lUJri^n 


(2.45) 


to  the  vector  (/(aq), /(x2), . . . ,  /(xn))*,  where  oq,o;2, . . .  ,u:n  and  xi,  x2, . . . ,  xn  are  given  (equi¬ 
spaced  or  nonequispaced).  As  we  show  in  Section  5,  the  application  of  the  matrix  Tp  will  be 
accelerated  by  the  algorithm  of  this  paper. 
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2.4  Generalized  Gaussian  quadratures 

This  section  contains  information  concerning  generalized  Gaussian  quadratures  (see  [13]  and  [25] 
for  more  detailed  information).  Given  two  functions  /  :  [a,b]  — ►  M  and  w  :  [ a,b\  — ►  R+,  we  may 
wish  to  approximate  the  integral 


f(x)  w(x)  dx 


(2.46) 


by  a  sum.  An  n-point  quadrature  rule  for  the  approximation  of  integral  (2.46)  consists  of  nodes 
Xi,  X2,  ■  ■  ■ ,  xn  G  [a,  b ] ,  weights  w\,  W2,  ■  ■  ■ ,  wn  >  0,  and  is  given  by  the  sum 


3  = 1 


(2.47) 


For  example,  the  formula  for  the  well  known  n-point  trapezoidal  rule  is 

fb  "-1  h 

/  f{x)dx  «  ^hf(a  +  jh)  -  -(/(a)  +  /(&)), 

Ja  j= o 


(2.48) 


where  h  =  Generalized  Gaussian  quadratures  are  quadratures  for  which  the  approximation 


f{x)  w(x)  dx 


3= 1 


(2.49) 


is  exact  for  a  choice  of  2 n  different  functions  /.  We  now  give  a  more  concise  definition. 

Definition  2.9  An  n-point  quadrature  is  referred  to  as  an  n-point  generalized  Gaussian  quadrature 
with  respect  to  the  weight  function  w  :  [a,  b]  — »  M+  and  functions  /i,  fa, . . . ,  f^n  '■  [a,  b]  — *  R  if 

rb  n 


fk(x)w(x)  dx  = 


=  Y, 


fk(Xj 


(2.50) 


3= i 


for  k  =  1,2,  ...,2  n.  The  nodes  xi,  X2,  ■  ■  ■ ,  xn  are  referred  to  as  Gaussian  nodes  and  the  weights 
w\,  W2,  ■  ■  ■ ,  wn  >  0  are  referred  to  as  Gaussian  weights,  both  of  which  are  unique. 


Classical  ?r-point  Gaussian  quadratures  are  those  that  exactly  integrate  polynomials  up  to  degree 
2 n  —  1  with  respect  to  some  weight  function  w.  The  nodes  for  these  Gaussian  quadratures  turn 
out  to  be  the  roots  of  the  polynomial  of  degree  n  in  the  class  of  orthogonal  polynomials  associated 
with  the  weight  function  w.  Relevant  orthogonal  polynomials  are  discussed  in  Sections  2. 5-2. 8. 

Table  1  lists  some  common  weight  functions,  their  associated  orthogonal  polynomials,  and  the 
quadrature  weights  for  the  classical  n-point  Gaussian  quadrature  formula 


f(x)  w(x)  dx 


Y.WjfiXj). 

3= 1 


(2.51) 


The  column  labeled  “[a,  6]”  denotes  the  interval  that  the  weight  function  is  defined  on.  The  column 
“w(x)n  gives  the  formula  for  the  weight  function.  The  column  “associated  polynomials”  lists  the 
class  of  orthogonal  polynomials  associated  with  the  given  interval  and  weight  function.  The  column 
uWj”  lists  the  formula  for  the  jth  weight  in  the  classical  n-point  Gaussian  quadrature.  In  all  cases, 
Xj  is  used  to  denote  the  jth  root  of  the  associated  polynomial  of  degree  n. 
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[a,  b]  w(x)  Associated  polynomials  Wj 


[-1,1]  1 

Legendre 

2  (1  —  xj)~L  Pnixj)~2 

[-1,1] 

(1-x2)-1/2 

Chebyshev 

7 r  n^1 

(— oo, oo) 

g-I2 

Hermite 

'L'n(Xj)  2 

[0,  oo) 

e~x 

Laguerre 

Table  1:  Various  weight  functions,  their  associated  orthogonal  polynomials,  and  discrete  weights 
for  ro-point  Gaussian  quadratures. 


2.5  Legendre  polynomials 

This  section,  and  Sections  2. 6-2. 8,  contain  information  about  classical  orthogonal  polynomials  (see 
[10]  and  [20]).  For  an  integer  n  >  0,  we  denote  by  Pn  the  Legendre  polynomial  of  degree  n.  As  is 
well  known,  Pn  is  the  solution  to  the  differential  equation 

({l  -  x2)  +  n  (n  +  1)  y  =  0  (2.52) 

ax  \  ax J 


that  is  bounded  on  the  interval  [—1,1].  Legendre  polynomials  are  orthogonal  on  the  interval  [—1,1] 
with  respect  to  the  inner  product  defined  by  the  weight  function  w(x)  =  1,  i.e.  for  any  m,n  >  0, 


Pm{x)  Pn{x)  dx 


2 

2ro  +  1 


(2.53) 


Legendre  polynomials  satisfy  the  recurrence 


(n  +  1)  Pn+i(x)  =  (2 n  +  1  )xPn(x)  -  nP„_i(i)  (2.5 4) 

with  initial  conditions 

Po(x)  =  1,  (2.55) 

Pi(x)  =  x.  (2.56) 

Differentiating  (2.54)  yields  a  recurrence  for  the  derivatives  of  Legendre  polynomials,  which  are 

used  to  calculate  Gaussian  weights  (see  Section  2.4).  The  recurrence  (2.54)  is  well  known  to  be 
stable  and  can  be  used  to  calculate  Legendre  polynomials  of  all  orders. 

Since  Legendre  polynomials  constitute  an  orthogonal  basis  in  L2[— 1,1],  for  any  square  integrable 
function  /  :  [— 1, 1]  — >  R  there  exist  real  cxq,  aq,  ct2,  ■  ■  ■  such  that 

OO 

fix)  =  ak  Pkix)-  (2-57) 

k= o 

In  most  numerical  applications,  the  series  in  equation  (2.57)  is  truncated  after  n  terms,  for  some 
appropriately  chosen  n.  The  resulting  finite  sum  is  a  stable  interpolating  polynomial  of  degree 
n  —  1,  and  it  is  commonly  used  for  accurate  numerical  evaluation  of  /.  The  function  /  is  thus 
approximated  by  a  polynomial  p  :  [—1,1]  — >  R,  given  by  the  formula 


n—1 

p{x)  =  y ^akPkjx). 

k= 0 


(2.58) 
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If  /  has  n  continuous  derivatives,  then  for  any  x  £  [—1,1]  the  error  of  this  approximation  is  of  the 
form 

on  „  | 

|/(x)-p(;c)|=p^||/(”)K)|,  (2.59) 

for  some  £  £  [—1, 1].  We  recall  from  Section  2.4  that  the  n-point  Gaussian  quadrature  for  weight 
function  w(x)  =  1  on  the  interval  [—1,1]  has  nodes  xi,X2,  ■  ■  ■  ,xn  equal  to  the  roots  of  Pn  and 
weights  w\,  W2,  ■  ■  ■ ,  wn  given  by 

2 

Wj  =  - 7TT - 77 - -77.  (2.60) 

J  (1  -x^)P'n{XjY  1  j 

Applying  this  quadrature  to  products  of  Legendre  polynomials  yields  the  identity 


2 

wi  Pk(xj)  Fi 0 xi )  =  2kT I  ^ 


(2.61) 


for  all  0  <  k,  l  <  n  —  1.  If  /  is  tabulated  at  the  nodes  X\,X2,  ■  ■  ■  ,xn,  the  expansion  coefficients  in 
equation  (2.58)  are  given  by  the  formula 


ak  =  2k  +  1  ^2  Wj  f(xj)  Pk(Xj). 


(2.62) 


For  an  integer  n  >  0,  we  define  the  linear  operator  Ap  : 


Ap  {f{x  1),  f(x2), f{xn))  =  (a0, an_i) , 


(2.63) 


as  the  Legendre  transform  of  size  n.  Ap  converts  values  of  /  at  the  roots  of  Pn  into  coefficients  in 
a  Legendre  polynomial  expansion  via  equation  (2.62).  In  matrix  notation, 


Ap  =  Sp  Tp  Wp , 


(2.64) 


where 


SP  = 


TP  = 


Po{xi) 


Pn— 1  (-^l) 


Po(xn) 


Pn—l(xn) 


(2.65) 


(2.66) 


(2.67) 


The  algorithm  presented  in  Section  4  accelerates  the  application  of  the  matrix  Tp,  and  thus  accel¬ 
erates  the  evaluation  of  Legendre  polynomial  expansion  coefficients. 
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2.6  Chebyshev  polynomials 

This  section  contains  facts  concerning  Chebyshev  polynomials  that  will  be  used  in  Sections  3  and 
5.  For  an  integer  n  >  0,  we  denote  by  Tn  the  Chebyshev  polynomial  of  degree  n.  Tn  is  the  solution 
on  [—1,1]  to  the  differential  equation 


d_ 

dx 


y  =  o, 


(2.68) 


and  is  given  explicitly  by  the  formula 


Tn(x)  =  cos(n  arccos  x). 


(2.69) 


Chebyshev  polynomials  are  orthogonal  on  the  interval  [—1,1]  with  respect  to  the  weight  function 
w(x)  =  (1  —  x2)-1/2,  i.e.  for  any  rn,  n  >  0, 


Tm(x)  Tn(x) 


dx 

\/l  —  x2 


ifm/n 
if  m  =  n  0  . 

if  m  =  n  =  0 


Chebyshev  polynomials  also  satisfy  the  recurrence 


Tn+ i(x)  =  2 xTn(x)  -  Tn-\(x) 


(2.70) 


(2.71) 


with  initial  conditions 

T0(x)  =  1,  (2.72) 

T1(x)  =  t.  (2.73) 

Recurrence  (2.71)  is  stable  due  to  the  fact  that  \Tn\  <  1  for  all  n  >  0,  and  can  thus  be  used  to 
generate  Chebyshev  polynomials  numerically. 

Analogous  to  Legendre  polynomials,  Chebyshev  polynomials  are  dense  in  L2[—  1, 1],  Thus,  for 
any  square  integrable  function  /  :  [—  1, 1]  — >  R,  there  exist  real  ao,  oq,  •  •  •  such  that 


f{x)  =  ^akT^x). 
k= o 


(2.74) 


In  most  numerical  applications,  we  approximate  the  function  /  using  a  polynomial  p  :  [— 1, 1]  — ►  R 
constructed  from  a  truncated  version  the  sum  in  (2.74),  such  that 


n— 1 

f(x)  &  p(x)  =  ^2  ak  Tk(x). 

k= 0 


(2.75) 


If  /  has  n  continuous  derivatives,  then  for  any  x  G  [—1, 1]  the  error  of  approximation  (2.75)  is  given 
by  the  formula 

\f(x)  -P(x)\  =  f2J^  (2.76) 

for  some  ^  G  [—1,1].  We  recall  from  Section  2.4  that  the  n-point  Gaussian  quadrature  for  weight 
function  w(x)  =  (1  —  x2)^1/2  on  the  interval  [—1, 1]  has  nodes  xi,X2, . . .  ,xn  equal  to  the  roots  of 
Tn,  given  explicitly  by 


(2 j  -  1)7 r 
2  n 


(2.77) 
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and  weights  w\ ,  W2 , . . . ,  wn  given  explicitly  by 

7 r 


Applying  this  quadrature  to  products  of  Chebyshev  polynomials  yields  the  identities 


(2.78) 


T:  Wj  T0(xj )2  =  7T,  (2.79) 

3= 1 

n  _ 

y^WjTk(xj)Ti(xj)  =  ^ 4/ ,  (2.80) 

i=i 

for  all  1  <  A;,  l  <  n  —  1.  If  /  is  tabulated  at  the  points  xi,X2, . . .  ,xn,  the  expansion  coefficients  in 
equation  (2.75)  are  given  by  the  formulas 


« 0  — 


n 

-£/(**)> 


j=i 


Ofc  =  -Y ^  f{x j)Tk{xj), 
i=i 

for  fc  =  1,  2, . . . ,  n  —  1.  For  an  integer  n  >  0,  we  define  the  linear  operator  Ax  : 


(2.81) 

(2.82) 


(/(®i),  f(x 2),  •  •  • ,  /(xn))  =  (a0,  ai, . . . ,  an_i) , 


(2.83) 


as  the  Chebyshev  transform  of  size  n.  Ax  converts  values  of  /  at  the  roots  of  Tn  into  coefficients 
in  a  Chebyshev  polynomial  expansion  using  equations  (2.81)  and  (2.82).  In  matrix  notation, 


Ax  =  —  Sx  Tx , 

n 


where 


ST  = 


/I 


V 


/  T0(si) 


Tr  = 


V 

T0(xn ) 


(2.84) 


(2.85) 


(2.86) 


\rn_i(xi)  •••  rn_i(x„)y 

The  algorithm  of  Section  4  will  enable  the  accelerated  application  of  the  matrix  Tx  in  equa¬ 
tion  (2.86). 


Remark  2.10  Although  the  algorithm  of  this  paper  accelerates  the  calculation  of  the  expansion 
coefficients  ao,  aq, . . . ,  ccn_i  in  equation  (2.75)  via  the  application  of  the  matrix  Tx,  it  is  well 
known  that  due  to  the  connection  between  trigonometric  polynomials  and  Chebyshev  polynomials 
(see  identity  (2.69)),  these  expansion  coefficients  can  also  be  calculated  rapidly  via  the  use  of  an 
FFT.  When  the  basis  functions  are  Chebyshev  polynomials  (or  complex  exponentials)  and  the 
quadrature  nodes  x±,  X2,  ■  ■  ■ ,  xn  are  the  roots  of  Tn  (or  equispaced),  the  use  of  an  FFT  to  calculate 
the  expansion  coefficients  is  preferable,  and  faster  than  the  algorithm  of  this  paper.  However, 
for  nonequispaced  nodes  x\,  x%,  ■  ■  ■ ,  xn,  the  algorithm  of  this  paper  becomes  a  new  version  of  the 
nonequispaced  FFT. 
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The  integrals  in  Lemma  2.11  below  are  used  in  Section  3.  These  identities  are  a  restatement  of 
Formulas  7.355.1  and  7.355.2  in  [10]. 

Lemma  2.11  For  any  integer  n  >  0,  and  any  real  number  a  >  0, 

f 1  r/o1  77- 

/  r2n+i(x)  sinax  =  =  (~l)n  -  J2n+i(a),  (2.87) 

Jo  V 1  —  Xz  2 

/  T2n(x)  cos  ax  “  =  (-1)"  -  J2n(a),  (2.88) 

Jo  V 1  -  x2  2 

where  Jn  is  the  Bessel  function  of  the  first  kind  of  order  n. 


2.7  Hermite  polynomials 

For  an  integer  n  >  0,  we  denote  by  Hn  the  Hermite  polynomial  of  degree  n,  which  is  the  solution 
on  R  to  the  differential  equation 


4-  (e~x )+2ne-x~  y  =  0, 


dx 


dx 


given  by  the  formula 


(2.89) 


(2.90) 


Hermite  polynomials  are  orthogonal  on  R  with  respect  to  the  weight  function  w{x)  =  e  x  ,  i.e.  for 
any  m,  n  >  0, 

/*oo 

/  Hm(x)  Hn(x)  e~x  dx  =  2n  n!  6mn.  (2-91) 


Furthermore,  Hermite  polynomials  satisfy  the  recurrence 

Hn+i(x)  =  2 xHn(x)  -  2nHn-i(x) 

with  initial  conditions 


(2.92) 


Ho{x)  =  1,  (2.93) 

H\(x)  =  x.  (2.94) 

Differentiating  (2.92),  we  obtain  a  recurrence  for  the  derivatives  of  Hermite  polynomials,  which 

are  used  to  calculate  the  weights  of  the  Gaussian  quadrature  corresponding  to  the  weight  function 
2 

w(x)  =  e~x  (see  Section  2.4).  The  use  of  Hermite  polynomials  in  numerical  applications  is  prohib¬ 
ited  due  to  poor  scaling  for  large  values  of  their  argument,  i.e.  for  large  |x|,  \Hn[x)\  is  very  large, 
for  all  n  >  0.  Thus,  it  is  necessary  that  we  instead  use  normalized  Hermite  functions,  given  by  the 
formula 


K{x)  = 


1 


—  —  / — r 

7r 4  2 2  n\ 


Hn(x)e  2  . 


(2.95) 


Normalized  Hermite  functions  form  an  orthonormal  basis  in  L2(R),  and  satisfy  the  differential 
equation 

d2y 


dx2 


+  (2 n  +  1  —  x2)  y  =  0. 


The  functions  h,Q,  h\ ,  h2r. . .  can  be  calculated  numerically  using  the  recurrence 

hn+i{x)  =  \l  ~[  xhn(x )  -  « /  H  hn—\(x) 


n  +  1 


n  +  1 


(2.96) 


(2.97) 
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with  initial  conditions 


(2.98) 


h0(x)  = 


1  -« 


i 

7T  4 


e  2 


M®)  =  2  • 

7T  4  \/2 

Since  Hennite  functions  are  dense  in  L2(M),  for  any  square  integrable  function  /  : 
exist  real  coefficients  ao,  aq,  a2,  ■  •  •  such  that 


f(x)  =  M®)- 


(2.99) 
R  there 

(2.100) 


k=0 


Most  numerical  applications  involving  Hermite  polynomials  use  a  truncated  version  of  the  sum  in 
equation  (2.100)  to  provide  an  approximation  to  /,  so  that 


n—  1 


/(*)  ~  S  (®) 


(2.101) 


k=0 


We  recall  from  Section  2.4  that  the  n-point  Gaussian  quadrature  for  weight  function  w(x)  =  e 
on  M  has  nodes  X\,  X2,  ■  ■  ■ ,  xn  equal  to  the  roots  of  Hn  and  weights  w\,  W2,  ■  ■  ■ ,  wn  given  by 


Wj  = 


s/t:  2n+1  n\ 


J  Kixj? 

If  this  quadrature  is  applied  to  products  of  Hermite  polynomials,  we  obtain  the  identity 


y:  Wj  Hk(xj )  Hi(xj)  =  s/tt  2k  k\  Ski 

3= 1 


(2.102) 


(2.103) 


for  all  0  <  k,  l  <  n  —  1.  A  straightforward  algebraic  manipulation  shows  that  (2.103)  is  equivalent 
to 


y:  Wj  hk{xj)  hi(xj )  =  Ski, 

3= 1 


where 


re,-  = 


(2.104) 


(2.105) 


J  K  {Xjr 

In  this  paper,  we  require  the  roots  of  Hn  for  use  in  computing  the  expansion  coefficients  in  equa¬ 
tion  (2.101)  and  quadrature  weights  in  equation  (2.105).  These  roots  x\,  x-i,  ■  ■  ■ ,  xn  can  be  numer¬ 
ically  calculated  using  the  method  contained  in  [9];  once  calculated,  ak  is  given  by  the  formula 


otk  =  y Iwj  f(xj )  hk(xj 

3= 1 


(2.106) 


for  k  =  0, 1, 2, . . . ,  n  —  1.  For  an  integer  n  >  0,  we  define  the  linear  operator  Ah  :  Mn  — >  M”, 

Ah  (f(x l),  f(x2),  •  •  • ,  f{xn))  =  («o,  «i,  •  •  • ,  «n-i) ,  (2.107) 

as  the  Hermite  function  transform  of  size  n.  Ah  converts  values  of  /  at  the  roots  of  Hn  into 
coefficients  in  a  Hermite  function  expansion  using  equation  (2.106).  In  matrix  notation, 


Ah  =  WhTh , 


(2.108) 
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where 


WH  = 


(2.109) 


(  ho(xi)  •••  h0(xn)  \ 

Th=  :  ;  •  (2.110) 

\hn~  i(xi)  •••  hn-i(xn)J 

The  algorithm  described  in  Section  4  will  accelerate  the  application  of  matrix  Tjj  in  equation  (2.110). 


2.8  Laguerre  polynomials 

For  an  integer  n  >  0,  we  denote  by  Ln  the  Laguerre  polynomial  of  degree  n,  which  is  the  solution 
on  R+  to  the  differential  equation 


given  by  the  formula 


l~(xe  x-T~\+ne  xy  =  o, 

ax  \  ax  J 


(2.111) 


(2.112) 


Laguerre  polynomials  form  an  orthonormal  basis  in  L2(R+)  with  respect  to  the  weight  function 
w(x)  =  e~x,  i.e.  for  any  m,n  >  0, 


Lmix)  Tn(x)  e  dx  —  dmn. 


(2.113) 


Laguerre  polynomials  satisfy  the  recurrence 


with  initial  conditions 


(n  +  1)  Ln+i(x)  =  (2n  +  1  -  x)  Ln(x)  -  nLn_i(x) 


L0(x)  =  1, 


L\(x)  =  1  —  x. 


(2.114) 

(2.115) 

(2.116) 


Differentiating  recurrence  (2.114)  yields  a  recurrence  for  the  derivatives  of  Laguerre  polynomials, 
which  are  used  to  calculate  weights  for  the  Gaussian  quadrature  corresponding  to  weight  function 
w(x)  =  e~x  on  R+  (see  Section  2.4).  Much  like  Hermite  polynomials,  Laguerre  polynomials  are 
not  functions  that  can  be  used  numerically  due  to  improper  scaling.  In  numerical  applications,  we 
instead  use  normalized  Laguerre  functions  £n,  given  by  the  formula 


£n(x)  =  Ln(x )  e  * 


(2.117) 


Laguerre  functions  satisfy  the  same  recurrence  as  Laguerre  polynomials,  recurrence  (2.114),  except 
with  the  initial  conditions 

4(x)  =  ezr,  (2.118) 


£\{x)  =  (1  —  x)e  2 


(2.119) 


The  functions  £q,£\ ,l2,  ■  ■  ■  can  be  numerically  evaluated  using  recurrence  (2.114)  with  initial  condi¬ 
tions  above  in  equations  (2.118)  and  (2.119).  Since  Laguerre  functions  form  an  orthonormal  basis 
in  L2(R+),  for  any  square  integrable  function  /  :  M+  — >  R  there  exist  ao,  aq,  a2,  •  •  •  such  that 


OO 

f(x )  =  ^ak£k(x). 

k= 0 


(2.120) 


Many  numerical  applications  use  a  truncated  version  of  the  sum  in  equation  (2.120)  to  approximate 
the  function  /,  such  that 

n—  1 

f(x)  «  ^2  ak  ik{x).  (2.121) 

k= 0 

To  determine  ao,  ai, . . . ,  an_i,  we  recall  from  Section  2.4  that  the  n-point  Gaussian  quadrature 
corresponding  to  weight  function  w(x)  =  e~x  on  R+  has  nodes  aq,  X2,  ■  ■  ■ ,  xn  equal  to  the  roots  of 
Ln  and  weights  w\,  W2,  ■  ■  ■ ,  wn  given  by 


Wj  = 


Xj  L'n{xjf 


(2.122) 


Applying  this  quadrature  to  products  of  Laguerre  polynomials  we  obtain  the  identity 


Lk(xj)  Li(xj)  =  Skt, 

3= 1 


(2.123) 


for  all  0  <  k,l  <  n  —  1.  After  an  algebraic  manipulation  of  (2.123),  we  obtain  a  similar  expression 
involving  normalized  Laguerre  functions, 


^Wj  £k(xj)  tl(xj)  =  Skt, 
3= 1 


(2.124) 


where 


(2.125) 


The  roots  x±,X2,  ■  ■  ■  ,xn,  which  are  used  in  formulas  (2.122-2.125),  can  be  numerically  calculated 
using  the  method  contained  in  [9].  The  coefficients  in  expansion  (2.121)  are  then  given  by  the 
formula 

n 

®k  =  ^2  Wj  f(xj)  £k(xj).  (2.126) 

3= 1 


For  an  integer  n  >  0,  we  define  the  linear  operator  Ak  :  Rn  — >  Rn, 


A  L  (fix  1),  f(x2),  •  •  • ,  f(xn))  =  (ao,  a!,...,  a„_i) ,  (2.127) 


as  the  Laguerre  function  transform  of  size  n.  Ak  converts  values  of  /  at  the  roots  of  Ln  into 
coefficients  in  a  Laguerre  function  expansion  using  equation  (2.126).  In  matrix  notation, 


where 


al  =  wltl, 


WL 


\ 


V 


WnJ 


(2.128) 


(2.129) 
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(  4(zi) 


tl  = 


£0(xr 


(2.130) 


\^n—  l(^l)  ’’’  ^n—  l(^n)y 

The  application  of  matrix  TjJ  in  equation  (2.130)  will  be  accelerated  by  the  algorithm  described  in 
Section  4. 

Observation  2.12  In  Section  5  we  will  use  the  fact  that  for  integers  n,  v  >  0,  the  normalized 
generalized  Laguerre  functions  £vn  are  the  eigenfunctions  of  the  Fourier-Bessel  transform  of  order  v. 
The  normalized  generalized  Laguerre  functions  are  given  by  the  formula 


C(x)  = 


n\ 


■e  2X2  Lvn{x), 


(v  +  n)\ 

where  L^(x)  is  the  solution  on  M+  to  the  differential  equation  (similar  to  equation  (2.111)) 

d_ 

dx 

given  by  the  formula  (which  is  very  similar  to  that  in  equation  (2.112)) 

1  dn 

L"(x)  =  -ex  x~v  -f-  (e~xxn+u)  . 

n\  /  nf  Arvn  V  / 


xv+1  e~x  —  )  +nxue~xy  =  0, 
dx 


(2.131) 


(2.132) 


(2.133) 


dxr‘ 

The  eigenfunction  relation  is  given  by  the  formula 

roo 

/  xJu(xy)C(x2)dx  =  (-l)nC(y2).  (2.134) 

Jo 

More  general  forms  of  equation  (2.134)  can  be  found  in  [3]  and  [10]. 

2.9  Prolate  spheroidal  wave  functions 

In  this  section  we  summarize  certain  properties  of  prolate  spheroidal  wave  functions  to  be  used  in 
Section  5;  more  detailed  information  can  be  found  in  [19]  and  [24],  For  a  real  c  >  0,  we  define  the 
operator  Fc  :  L2[— 1,1]  — >  L2[—  1, 1]  as 


(Fctp)(x)  = 


Acxt 


(p(t)  dt. 


(2.135) 


'-1 


The  operator  Fc  has  eigenvalues  Ao,  Ai,  A2, . . .  such  that  |An_i|  >  |An|  for  all  n  >  1.  Furthermore, 
for  each  n  =  0,1,2,... 

\n  =  in\K\-  (2.136) 

We  denote  by  the  nth  prolate  spheroidal  wave  function  (PSWF),  which  is  the  eigenfunction  of  Fc 
corresponding  to  the  eigenvalue  An.  The  function  ^  is  also  a  solution  on  [—1,1]  to  the  differential 
equation 

d  ^(1  -  x2)  +  (xn  -  c2x2)y  =  0,  (2.137) 


dx 


dx 


where  0  <  xo  <  Xi  <  X2  <  •  •  •  are  referred  to  as  the  separation  coefficients.  The  functions 
V’q,  V’ij  V’i’  •  •  •  form  an  orthonormal  basis  in  L2[— 1,1]  under  the  usual  inner  product  formula,  i.e. 
for  any  m,  n  >  0 


£ 


^n{x)  dx  =  8n 


(2.138) 
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For  a  given  bandwidth  c  and  properly  chosen  n,  the  functions  . . . ,  V’n-i  are  a  set  of  stable 

interpolation  functions  which  can  be  used  to  accurately  approximate  functions  of  the  form  elcx , 
with  x  £  [—1, 1].  To  rephrase,  given  a  function  /  :  [—1, 1]  — >  C, 

f(x)  =  eicx,  (2.139) 

there  exist  coefficients  ao,  ai, . . . ,  an-i  such  that 

n—  1 

/0r)~5Za  ki’Ux)-  (2.140) 

fc=i 


If  interpolation  nodes  are  chosen  to  be  the  n  roots  of  the  function  the  determination  of 
ao,  ou,  ■  ■  ■ ,  ctn-i  is  a  well-conditioned  procedure.  Using  the  methods  of  [9],  we  can  numerically 
calculate  the  roots  of  any  PSWF  as  well  as  evaluate  any  PSWF  on  the  interval  [—1,1].  Suppose 
now  that  the  coefficients  in  expansion  (2.140)  are  known,  and  one  wishes  to  evaluate  the  expansion 
at  the  n  points  x±,x2,  ■  ■  ■  ,xn  (for  example,  the  n  roots  of  V’n)-  For  any  integer  n  and  complex 
numbers  ao,  «i, . . . ,  an_i,  we  define  the  linear  operator  :  Rn  — >  Rn, 

(a0, «!,...,  an_i)  =  (f(x i),  f{x2),  •  •  • ,  f(xn))  (2.141) 


as  PSWF  interpolation  of  size  n.  In  matrix  notation, 


/%{xi) 

\%(xn) 


Vn-i(xi) 
n~l(xn ) 


(2.142) 


The  algorithm  described  in  Section  4  will  accelerate  the  application  of  the  matrix  E^  in  equa¬ 
tion  (2.142). 


3  Analytical  apparatus 

The  algorithm  of  this  paper  relies  on  the  observation  that  for  certain  n  x  n  matrices  the  numerical 
rank  of  any  m  x  k  contiguous  submatrix  depends  only  on  the  quantity  mk.  This  section  contains 
a  proof  of  this  fact  for  the  discrete  Fourier  transform  and  the  discrete  Fourier-Bessel  transform. 
Inter  alia,  we  first  prove  analogous  results  for  the  continuous  Fourier  transform  and  continuous 
Fourier-Bessel  transform. 

3.1  Rank  analysis  of  the  Fourier  transform 

The  Fourier  transform  T  :  L2(R)  — >  L2( R)  is  given  by  the  formula 

/OO 

eixtf(t)dt.  (3.1) 

-CO 

Let  a,  b,  u ,  v  be  real  numbers  such  that  —  n  <  a  <  b  <  ir  and  u  <  v.  We  wish  to  show  that  the 
operator  S  :  L2[a,b\  — >■  L2[u,v\,  given  by  the  formula 

(Sf)(x)=  [beixtf(t)dt,  (3.2) 

J  a 

has  numerical  rank  that  depends  only  on  the  quantity  (v  —  u)(b  —  a).  First,  we  define  what  we 
mean  by  the  numerical  rank  of  an  integral  operator,  as  well  as  the  numerical  rank  of  a  matrix. 


20 


p 

k 

Error 

Predicted  k 

10 

30 

.99111E-12 

49 

50 

83 

.99726E-12 

95 

100 

142 

.99918E-12 

152 

500 

570 

.99992E-12 

581 

1000 

1088 

.99991E-12 

1100 

5000 

5151 

.99967E-12 

5167 

10000 

10190 

.99991E-12 

10210 

50000 

50324 

.99995E-12 

50357 

Table  2:  Accuracy  of  Chebyshev  expansion  for  complex  exponential. 


Definition  3.1  We  say  that  an  integral  operator  S  :  L2[a,  b]  — >  L2[c,d\  with  kernel  s  :  [c,d\  x 

M-c, 

(Sf)(x)  =  f  s(x,t)  f(t)dt,  (3.3) 

J  a 

has  rank  k  if  s  can  be  written  as 


k 

s{x,t )  =  y^gn(x)hn(t),  (3.4) 

n= 1 

for  some  set  of  functions  gi,h\,  <72,^2, . gk,hk-  Likewise,  S  has  rank  k  to  precision  e  >  0  if  s  can 
be  written  as 

OO 

s(x,t)  =  y^gn(x)  hn(t),  (3.5) 

n=  1 

where  for  every  n  >  k, 

\gn{x)  hn(t)\  <  e  (3.6) 

for  all  t  £  [a,  b]  and  x  £  [c,  d\. 

If  A  is  an  mxn  matrix,  we  say  that  it  has  rank  k  if  we  can  write  the  singular  value  decomposition 
of  A  as 

A  =  UDVt,  (3.7) 

where  U  and  V  are  matrices  of  dimension  m  x  k  and  n  x  k,  respectively,  both  whose  columns  are 
orthonormal  and  D  is  a  k  x  k  diagonal  matrix  with  diagonal  elements  di,  a\  >  02  >  •  •  •  >  >  0. 

We  say  that  A  has  rank  k  to  precision  e  if  ay  >  e  >  <rfc/+1. 

Our  aim  now  is  to  show  that  the  kernel  of  the  operator  in  (3.2),  s(x,  t )  =  elxt ,  can  be  written  in  the 
form  (3.5).  The  following  lemma  is  an  immediate  consequence  of  Formulas  9.1.44  and  9.1.45  in  [1], 

Lemma  3.2  For  any  p  £  M  and  r  £  [—1, 1], 


OO 

eipT  =  Up)  +  2in  Jn(p)  Tn(r).  (3.8) 

n=  1 

Table  2  shows  the  numerical  accuracy  of  a  truncated  version  of  expansion  (3.8),  along  with  the 
predicted  expansion  order  from  Lemma  2.6.  The  column  labeled  “pn  represents  the  same  p  as  in 
equation  (3.8).  The  column  labeled  uk”  gives  the  number  of  terms  needed  in  expansion  (3.8)  to 
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achieve  an  absolute  accuracy  of  at  least  10  12  for  all  t  £  [—1,1].  The  values  in  the  column  labeled 
“error”  are  calculated  as 

k 

max  elpT  -  J0(p)  -  Y'  2  in  Jn(p)  T„(r)  .  (3.9) 

The  value  in  the  column  “predicted  /c”  is  the  smallest  positive  integer  k  such  that 

\2in  Jk(p)Tk(r)\  <  2  |  Jk(p)\  <  10”12.  (3.10) 

This  value  of  k  is  obtained  via  Lemma  2.6.  We  can  now  state  and  prove  the  following  theorem. 

Theorem  3.3  If  a,  b,  u,  v  are  real  numbers  such  that  —ir  <a<b<7T,u<v,  and  (v  —  u){b  —  a)  > 
1,  then  for  any  fixed  positive  e  <  0.1,  the  integral  operator  S  :  L2[a,  b]  —>  L2[u,  v\, 

(■ Sf)(x)=  [beixtf(t)dt,  (3.11) 

J  a 

has  rank  to  precision  e  at  most  a  constant  times  (v  —  u)(b  —  a). 


Proof.  We  first  expand  the  kernel  of  operator  (3.11)  into  the  form  of  (3.5).  We  begin  by  rewrit¬ 
ing  (3.11)  as 


(sf)(x)  =  sm=  f  e^+w&mdt, 

J  a 


where  S  :  L2[a ,  b]  — >  L2[0, 1],  and  £  is  given  by  the  formula 


(3.12) 


(3.13) 


Letting  t  =  -t  +  we  convert  (3.12)  into 


(S/)K)  =  t1  e<(M-<»-*)«X^+!±S)  /(T)  iT 


(3.14) 


If  we  now  define  s  to  be  the  kernel  of  integral  operator  (3.14),  and  let  c  =  (v  —  u)K^,  after  applying 
Lemma  3.2  we  obtain 


s^T)  =  b  ~  Q  ci(«+(t>-u)0(MT+^g) 

=  b~a  e ei(v-u)±pt  eidT  eiu^r 


b  ~  a  ciub4±  ci(v-u)b-±±j 


McO  +  Y.^nJn^)Tn{T)\  e 


JU^T 


»(£,  T)  =  (b  -  a)  eiu^  e^~u^  Jn(c£)  Tn(r)  eiu^ 


and  thus 


j  CXJ 

n— 1 


(3.15) 

(3.16) 


(3.17) 


To  simplify  the  above  expression  for  s,  for  n  >  0  let  the  function  nn  :  [0, 1]  x  [—  1, 1]  — >  C  be  the 
nth  term  in  expansion  (3.17),  i.e. 


(3.18) 


(3.19) 
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If  we  take  the  absolute  value  of  expression  (3.18)  for  nn,  we  arrive  at  the  inequality 


M£,r)|  =  in(b-a)eiub-^e^-u^Jn(^)Tn(r)eiub-^T 

(3.20) 

<  \{b  -  a)  Jn(c£)Tn(T) | 

(3.21) 

<  2p\Jn{cf)\. 

(3.22) 

Now,  if  c 

>  10,  then  by  Lemma  2.6,  \nn\  is  less  than  e  whenever 

n  >  (c1/3  +  a  c_1/3)3, 

(3.23) 

where 

-T-2/s(2;)' 

(3.24) 

If  c  <  10, 

\nn\  is  less  than  e  whenever 

n  >  (101/3  +  al0~1/3)3. 

(3.25) 

It  now  follows  from  equations  (3.23-3.25)  that  if  (v  —  u)(b  —  a)  >  1,  the  rank  to  precision  e  of  the 
integral  operator  S  is  at  most  a  constant  times  (y  —  u)(b  —  a).  □ 

Corollary  3.4  Let  Tp  be  the  n  x  n  matrix  of  the  discrete  Fourier  transform,  as  described  in 
Section  2.3.  If  A  is  any  m  x  k  contiguous  submatrix  of  Tp  such  that  mk  >  n,  then  for  any  fixed 
positive  £  <  0.1,  the  rank  of  A  to  precision  e  is  at  most  a  constant  times  the  quantity  mk. 

3.2  Rank  analysis  of  the  Fourier-Bessel  transform 

For  any  real  number  R  >  0  and  integer  v  >  0,  the  Fourier-Bessel  transform  of  order  v  7iv  : 
L2[0,R]  — ■>  L2(M)  is  given  by  the  formula 

fR 

(hLuf){p)  =  /  \ftJv{pt)  f{t)dt  (3.26) 

Jo 

(see  Section  2.2).  Let  a ,  b ,  u,  v  be  real  numbers  such  that  0  <  a  <  b  <  R  and  u  <  v.  We  wish  to 
show  that  the  operator  Uv  :  L'2[a,b]  — >  L2[u,v],  given  by  the  formula 

(u„f)(j>)=  [R  VtMpt)f(t)dt,  (3.27) 

Jo 

has  numerical  rank  at  most  a  constant  times  (v  —  u)(b  —  a).  We  prove  this  fact  in  Theorem  3.5 
below. 

Theorem  3.5  For  any  fixed  positive  e  >  0.1,  R  >  0,  and  integer  v  >  0,  if  a,  b,  u,  v,  are  real 
numbers  such  that  0  <  a  <  b  <  R,  0  <  u  <  v,  and  (v  —  u)(b  —  a)  >  1,  then  the  integral  operator 
Uu  :  L2[a,b]  — >  L2[u,v\,  given  by  the  formula 

r-b 

0 Uuf)(p)=  /  VtJu(pt)f(t)dt ,  (3.28) 

J  a 

has  rank  to  precision  e  at  most  a  constant  times  (v  —  u)(b  —  a). 
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Proof.  We  prove  Theorem  3.5  in  the  case  where  v  =  2m,  for  some  integer  m  >  0.  The  proof  for 
odd  v  is  virtually  identical,  and  is  omitted.  Using  Lemma  2.11,  we  rewrite  (3.28)  as 


(U2mf)(p) 


Tzmiy) 

V1  -  v2 


cos  (pty)  dy 


fit)  dt. 


Changing  variables  according  to 


t 


b  —  a 
2 


T  + 


a  +  b 
2 


p  =  (v  -  u)£  +  u, 


the  operator  U2m  is  converted  into  U2m  :  L2[—  1, 1]  — >  L2[0, 1],  given  by  the  formula 


(3.29) 


(3.30) 

(3.31) 


(£W)({)  =  j\  \/+L  +  2-±t  (-!)’”  I  - 

f1  T2m(y)  (u  ^  ,  ! 

/  7= - =■  cos  [(v  -  u)£  +  u] 

Jo  vi -y  v 

/(T)  -Y~dr- 


b  —  a 
2 


r  + 


dy- 


(3.32) 


We  now  write  the  kernel  of  [/2m  in  the  form  of  (3.5).  Let  p,2m  :  [0, 1]  x  [—1, 1]  — ►  M  be  the  kernel  of 
U2m,  given  by  the  formula 


P2mit,r)  =am{T)  f  T;n{V\  cos  f  [+  -  u)£  +  u\ 
Jo  \/l-  yz  \ 


V1  -y2 


b  —  a  a  +  b 
^  T  H 


V  dy 


=  amjr )  f1  T2m{y)  ei[(;,-M)g+M][^r+^]i 

2  7-i  y7!  -  2/2 


dy, 


where 


am(r)  =  (~l)r 


b  —  a  b  —  a  a  +  b 


IT 


-T  + 


Letting  c=  (v  —  u)  -+ ,  we  expand  the  kernel  p2m  using  Lemma  3.2  into 
p2m{  s,^)  =  a'm^  f  Tlmjy)  ^iuS+ky  ci(v-u)2±b.£y  cic£Ty 

2  7-i  V1  -  y2 

_  Om(r)  I"1  T2m(y)  ^iua±by  ei(v-u)2±k£y 


eiuKrTy  dy 


'-i  V1  -  r 


do(c£y)  +  ^  2  +  Jn(c£y)  Tn(r )  dy. 


n=l 


(3.33) 

(3.34) 


(3.35) 


(3.36) 


(3.37) 


To  simplify  the  expression  for  /U2m,  for  all  n  >  0  let  cun  :  [0, 1]  x  [—1, 1]  — »  R  be  the  nth  term  in 
expansion  (3.37),  given  by  the  formula 


^ti(£,t)  =  am{r)  ir 


f1  T2m(y ) 
l-i  \J\  -  y2 


eiu^y  ei{v-u)*pzv  Jn(c£y)  Tn (r)  eiu^Ty  dy. 


a+b 


(3.38) 
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Taking  the  absolute  value  of  un,  we  obtain  the  inequality 


lwn(£,T)|  = 


Q!r 


(■ r)in  C  T^y\  eiu^y  Jn(d;y)  Tn(r)  j 

J- 1  \/l  -  v2 


< 


eiu^y  ei(v-u)^y  Jn(cty)Tn(T)ju!^Ty  dy  (3.40) 


a-| -b 


-1  dy 


l-i  \J\-y 

R  fiR  f1  \T-2m(y)\ 

vr  V  2  y_! 


R  3R  It/  \  I 
< —  \/ —  max  Jn(cr) 

“  7T  V  2  re[0,l]  . 

=  R\- max  I Jn(cr) I . 

V  2  re[o,i] 


Now,  if  c  >  10,  then  by  Lemma  2.6,  |wn|  is  less  than  e  whenever 


lub~^Tydy 


i-i 


n>(c1/3  +  «c-1/3)3 


where 


3-V3 


a  = 


log2/3f-l/? 


UV  2 


If  c  <  10,  \u>n\  is  less  than  e  whenever 


n 


>  (101/3  +  al0”1/3)s 


(3.39) 


(3.41) 

(3.42) 

(3.43) 

(3.44) 

(3.45) 


It  now  follows  from  equations  (3.43-3.45)  that  if  (v  —  u)(b  —  a)  >  1,  the  rank  to  precision  e  of  the 
integral  operator  Uu  is  at  most  a  constant  times  (v  —  u)(b  —  a).  □ 


Corollary  3.6  For  fixed  R  >  0,  letTju  be  the  nxn  matrix  of  the  discrete  Fourier- Bessel  transform, 
as  described  in  Section  2.2.  If  A  is  any  mx  k  contiguous  submatrix  ofTjv  such  that  mk  >  n,  then 
for  any  fixed  positive  e  <  0.1,  the  rank  of  A  to  precision  e  is  at  most  a  constant  times  the  quantity 
mk. 


4  The  butterfly  algorithm 

This  section  contains  a  description  of  an  algorithm  for  the  application  of  nxn  matrices  which  have 
the  property  that  any  m  X  k  contiguous  submatrix  has  numerical  rank  that  depends  only  on  the 
quantity  mk. 

4.1  Notation 

In  this  section,  we  establish  notation  that  will  be  used  in  Sections  4. 2-4. 5  when  describing  the 
algorithm. 

A  will  denote  the  matrix  being  decomposed  and  applied,  and  it  is  of  size  n  x  n  where  n  =  2N 
for  some  positive  integer  N. 

The  precision  to  which  all  interpolative  decompositions  are  performed  is  given  by  e. 

The  algorithm  imposes  a  dyadic  hierarchy  of  columns  and  rows  on  the  matrix  A.  Columns  of 
A  are  merged  with  increasing  level  while  rows  of  A  are  split  with  increasing  level.  Levels  0  and  1 
are  shown  in  Figure  1.  If  we  require  that  there  can  be  at  most  Cmax  columns  in  each  submatrix 
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Figure  2:  Level  L  of  A. 


on  level  0  of  the  hierarchy,  then  we  have  L  levels,  where  Cmax 2L~l  <  n  <  Crnax 2 L .  Level  L  of  the 
matrix  is  shown  in  Figure  2. 

On  every  level  l,  0  <  l  <  L,  there  are  a  constant  number  of  submatrices,  2L  ~  — .  We  denote 

by  A[  j  the  submatrix  in  position  (i,j)  on  level  l.  The  index  i  denotes  the  row  of  submatrices 
and  j  denotes  the  column  of  submatrices.  For  a  given  level  l,  i  ranges  between  1  and  2l  and  j 
ranges  between  1  and  2L~l .  The  contiguous  submatrix  A[  •  consists  of  the  intersection  of  rows 
( i  —  1)2^^^  +  1  through  i2N~l  and  columns  (j  —  l)2N~L~l  +  1  through  j2N~L~l  of  A.  Level  1  is 
given  by 


A  = 


A l,i 


A1 

^1,2 

A 1 
^2,2 


(4.1) 


and  level  L  is  given  by 


(  Aitl  \ 

Al 

^2,1 

^2^-1, 1 
\  ^2L ,  1  / 


(4.2) 


We  denote  the  rank  (to  precision  e)  of  A\  j  by  k\  j .  The  matrix  B[  ^  will  be  some  selection  of 
k\  j  columns  of  A\j  that  allow  for  a  stable  interpolative  decomposition  of  the  form  A\j  =  B\  •  P, 
for  some  matrix  P  (see  Section  2.1).  The  matrix  B1-  rj  will  be  referred  to  as  the  column  skeleton 
matrix  of  A\-,  and  P  will  be  referred  to  as  an  interpolation  matrix. 
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For  a  matrix  X  dimensioned  2m  x  k,  we  denote  by  X+  the  mx  k  matrix  obtained  from  the  top 
m  rows  of  X.  Likewise,  X~  is  used  to  denote  the  m  x  k  matrix  obtained  from  the  bottom  m  rows 
of  X.  Thus, 

X  =  (£)  .  (4.3) 

If  we  are  to  use  the  algorithm  described  in  Sections  4. 2-4. 5  to  compute  y  =  Ax,  then  for 
j  =  1,2,...,  2^  we  denote  by  x j  the  column  vector  of  length  2N~L  obtained  from  elements  (j  — 
1)2n~l  +  1  through  j2N~L  of  x.  Analogously,  for  i  =  1,2, . . .  ,2L  we  denote  by  yf  the  column 
vector  of  length  2N~L  obtained  from  elements  (i  —  1)2N~L  +  1  through  i2N~L  of  y. 


4.2  Informal  description 

Given  the  matrix  A,  the  general  strategy  of  the  algorithm  is  to  form  an  interpolative  decomposition 
for  every  matrix  on  every  level  using  previous  interpolative  decompositions  that  were  created  on 
previous  levels.  It  terminates  with  the  application  of  a  decomposed  version  of  the  matrix  A  to 
an  arbitrary  vector  x.  We  first  describe  a  two  level  algorithm,  and  then  generalize  to  a  multilevel 
algorithm. 

If  we  let  Cmax  =  7j,  then  there  exists  two  levels  of  A.  Level  0  is  given  by 

•4  =  Ki  A°2)  ,  (4.4) 

where  A®  ,  and  A®  2  are  both  nx  ^  matrices.  Level  1  is  given  by 


A  = 


(4.5) 


where  A\  1  and  A\  1  are  both  |  x  n  matrices.  Using  the  interpolative  decomposition  described  in 
Section  2.1,  we  can  write  A®  1  and  A®  2  as 


A°{  ,  =  5? ,  P,° 


1,1  ^i,i) 


(4.6) 


4° 

^1,2 


—  R°  P° 

~  nl,2  ^1,2) 


(4.7) 

where  B®  1  is  a  selection  of  k\  1  columns  of  A®  1,  B®  2  is  a  selection  of  h®  2  columns  of  A®  2,  and  P,°  l 
and  P\  2  contain  k\  l  X  1  and  k®  2  x  k®  2  identity  matrices,  respectively.  Using  decompositions  (4.6) 
and  (4.7),  the  matrix  A  is  now  given  by  the  formula 


-4  =  K  i 


(4.8) 


Decomposition  (4.8)  expresses  A  through  only  k®  1  +  k®  2  of  its  columns.  We  now  form  decomposi¬ 
tions  of  A\  j  and  A\  1  similar  to  those  in  (4.6)  and  (4.7). 

Since  A®  1  and  A®  2  of  level  0  can  be  represented  using  only  the  columns  chosen  for  B®  1  and 
B®  2,  respectively,  it  is  possible  to  choose  columns  of  B®  1  and  B\] 2  to  represent  matrices  A\  x  and 
A\  1  of  level  1.  Thus,  we  can  form  the  interpolative  decompositions 


and 


K,  ^J/2)+  =  B\a  P,1 ,  (4.9) 

Ki  B^2y  =  B\AP}2A,  (4.10) 
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where  B\  j  is  a  selection  of  k\  1  columns  of  A\  1  which  were  chosen  from  those  used  to  form  P?  1 
and  P?  2,  B2  i  is  a  selection  of  k\  x  columns  of  A\  1  which  were  chosen  from  those  used  to  form  P?  1 
and  P?  2,  and  Ph  and  P2l  contain  k\  x  x  k\  x  and  k\  1  x  k\  1  identity  matrices,  respectively.  The 
submatrices  A\  1  and  A 2  1  now  have  the  decompositions 


41  _  pi 

~  Di,i  -*1,1 


41  —  pi  pi 

^2,1  ~  n2,l  *2,1 


The  final  decomposition  of  the  matrix  A  is 


a  =  iBh 


0 

^2,1 


1 

0 

p° 

-*1,1 

0 


p1 

^1,1 

pi 

■*2,1 


0 

Ph 

0 

pO 

-*1,2 


p° 

-*1,1 

0 


p° 

*1  ,2 


(4.11) 

(4.12) 


(4.13) 


An  examination  of  decomposition  (4.13)  shows  how  an  arbitrary  vector  x  may  be  applied  to  the 
matrix  A. 

We  have  described  a  two  level  algorithm  for  the  decomposition  and  application  of  the  matrix 
A.  If  n  and  Cmax  are  such  that  the  highest  level  is  L  >  1,  we  proceed  as  follows.  Level  0  of  A  is 
then  given  by 

A=(^A\l  A?2  ■■■  A?2i^  ,  (4.14) 

where  41?  1;  A?  2,  •  •  • ,  A®  L  are  all  n  x  2N~L  matrices  (since  n  =  2N ,  for  some  N ).  First,  in- 
terpolative  decompositions  are  formed  for  these  submatrices  on  level  0,  creating  the  matrices 
^1,1,  -Pgi,  ■  •  • ,  P12l  •  Next,  for  3  =  l,2,...,2i_1  the  pair  of  matrices  P?2j_  1  and  P?  2j  is 

used  to  create  an  interpolative  decomposition  for  A\  ■  and  A\-,  creating  matrices  B\  ■  ,  Pj1  ■  and 
B^j,  P2j ,  respectively.  This  procedure  is  analogous  to  creating  decompositions  (4.11)  and  (4.12) 
in  the  two  level  algorithm  we  described  earlier  in  this  section. 

We  continue  forming  interpolative  decompositions  on  successively  higher  levels  by  combining 
column  skeleton  matrices  which  are  adjacent  to  each  other  on  previous  levels.  To  be  more  precise, 
for  l  >  0,  i  =  1, 2, . . . ,  2l,  and  j  =  1,2,...,  2L~l  we  use  the  matrices  Bl~} ,  .  „ .  ,  and  Bl~} ,  .  „ .  to 

create  an  interpolative  decomposition  for  A(  ■ ,  which  results  in  the  creation  of  column  skeleton 
matrix  B\  ■  and  interpolation  matrix  PA.  On  level  L,  the  decomposition  of  Afj  is  given  by  the 
formula 


( PL 72 


'  pL—  1 

Al  =nL  pL  I  L^J’1 

1  1  -M,  1  \  pL—1 


L^L2, 


L^Jd 


(ph 


ph 


pO 

1  1,2l-1 


piV/ 


P 


L- 2 


J+l 


(4.15) 


for  i  =  1,2,...,  2L .  For  L  >  1,  the  decomposition  of  the  whole  matrix  A  is  too  complicated  to  give 
here,  but  an  examination  of  (4.15)  shows  how  the  matrix  A  can  be  applied  to  a  vector  x  using  the 
individual  decompositions  of  A\J  j ,  A21, . . . ,  A^L  . 
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Remark  4.1  Since  for  all  l ,  i,  and  j  the  ranks  of  A[  ■  are  virtually  the  same,  all  the  interpolation 
matrices  P-j  for  l  >  0  are  of  bounded  size,  k  x  2  k,  where  k  is  the  rank  of  any  submatrix  on  any 
level.  The  algorithm  is  able  to  apply  A  to  an  arbitrary  vector  x  in  0(n  logra)  operations  due  to 
the  fact  that  all  of  the  interpolation  matrices  are  of  the  same  size. 

Observation  4.2  The  algorithm  described  in  Section  4.2  will  exhibit  the  same  performance  when 
applied  to  the  matrix  A t  since  all  submatrix  rank  considerations  remain  the  same  under  the  trans¬ 
pose  operation. 

4.3  Detailed  description 

This  section  contains  a  detailed  description  of  the  algorithm  that  was  described  informally  in 
Section  4.2  for  the  decomposition  and  application  of  the  matrix  A. 

Step  1 

Choose  main  parameters  and  create  dyadic  hierarchy 

1.  Set  precision  e  for  interpolative  decompositions. 

2.  Set  submatrix  size  parameter  Cmax. 

3.  Calculate  the  number  of  levels  L  such  that  Cmax 2L_1  <  n  <  Cmax2L. 

Comment  [Create  the  dyadic  hierarchy  of  columns  and  rows.  On  each  of  the  L  +  1  levels  of  the 
hierarchy,  there  will  be  2L  submatrices.  Retain  structure  created  for  use  in  precomputation.] 
do  l  =  0, 1, . . . ,  L 
do  i  =  1,2,..., 2l 

doj  =  l,2,...,2i-/ 

Denote  the  submatrix  consisting  of  rows  (i  —  1)2^“^  +  1  through  i 2N~l  and 
columns  (j  —  \)2N~L~l  +  1  through  j2N~L~l  as  A\  ■. 

end  do 
end  do 
end  do 


Step  2 


Precomputation 

Comment  [During  this  stage,  all  submatrices  on  all  levels  are  compressed  using  the  interpolative 
decomposition  described  in  Section  2.1.] 
do  l  =  0, 1, . . . ,  L 
do  i  =  1,2,...,  2l 

doj  =  l,2,...,2i-/ 
if  l  =  0  then 

1.  Form  the  interpolative  decomposition 

<i  =  £i0jA0r  (4-16) 


2.  Store  Pfh . 
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if  (Z  >  0  and  i  is  odd)  then 

1.  Form  the  interpolative  decomposition 


yl- 1 


=  Bl  Pl 


2.  Store  P/.-. 

LiJ 

3.  if  l  =  L  then  store  B\ 

LiJ 

if  (l  >  0  and  i  is  even)  then 

1.  Form  the  interpolative  decomposition 


l  —  l  r>l—l 


=  Bl  Pl 

i,j  i,j 


(4.17) 


(4.18) 


2.  Store  P-j. 

3.  if  l  =  L  then  store  B\ 

end  do 
end  do 
end  do 

Comment  [As  noted  above,  the  matrices  B\  ■  only  need  to  be  stored  in  memory  for  l  =  L.  Only 
these  column  skeleton  matrices  are  used  in  the  application  of  A  in  Step  3.  For  l  <  L ,  B\ ;  can  be 
computed  on  the  fly  by  only  using  column  and  row  numbers  of  the  original  matrix  A.] 

Step  3 


Application 

Comment  [During  application,  the  matrix  A  that  was  decomposed  during  precomputation  in  Step 
2  is  applied  to  an  input  vector  x  to  generate  an  output  vector  y.\ 
do  l  =  0, 1, . . . ,  L 
do  i  =  1,2,..., 2l 

doj  =  l,2,...,2i-/ 

if  l  =  0  then  calculate  and  store 


ii°  =  rL 

1 1  j  1 1  ,j  xj  • 


if  l  >  0  then  calculate  and  store 


u  ■  =  P  ■ 

» j  *  j 


u 


.i-i 


J-l 


if  l  =  L  then  calculate  and  store 


Vi  =  Bh  ui.\- 


(4.19) 


(4.20) 


(4.21) 
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end  do 
end  do 
end  do 

Comment  [The  vectors  , . . . ,  y^L  can  then  be  concatenated  to  form  the  vector  y  ~  Ax,  which  is 
accurate  to  relative  precision  e.] 

4.4  Complexity 

This  section  contains  an  operation  count  for  applying  the  algorithm  described  in  Section  4.3  to  the 
matrix  A.  A  brief  analysis  of  the  algorithmic  complexity  is  given  in  Table  3. 

Remark  4.3  We  claim  in  Table  3  that  during  Step  2,  each  of  the  interpolative  decompositions 
performed  can  be  computed  in  0(n )  operations.  Recalling  Observation  2.3,  the  interpolative  de¬ 
composition  of  an  m  x  l  matrix  of  rank  k  can  be  performed  in  0(kml  log^)  operations.  Since  the 
ranks  of  all  submatrices  on  all  levels  of  A  are  virtually  the  same,  k,  we  have  that  each  interpolative 
decomposition  can  be  computed  in  0(2k2  log2fc)  =  0(n)  operations,  where  d  is  the  level  of  the 
matrix  being  decomposed. 

Observation  4.4  If  the  matrix  A  is  non-singular  to  precision  e  (as  is  the  case  for  orthogonal 
polynomial  transform  matrices) ,  we  can  avoid  the  precomputation  of  level  0  since  the  interpolative 
decomposition  of  matrices  on  that  level  will  yield  P[h  =  Iko  ,  where  Ik  denotes  a  k  x  k  identity 

matrix  and  j  =  1,2,...,  2L . 


Step 

Purpose 

Operation  count 

Explanation 

1 

Initialization 

0(?i  log  n) 

No  numerical  computations  are  performed,  but  the 
dyadic  hierarchy  is  created  and  parameters  are  set. 

2 

Precomputation 

compressions 

0{n 2  logn) 

On  each  of  the  O(logn)  levels,  0(n)  interpolative 
decompositions  must  be  performed.  Each  decompo¬ 
sition  can  be  performed  in  0{n)  operations. 

3 

Application  of 
the  transform 

to  a  vector 

0(n  log  n) 

On  each  of  the  O(logn)  levels,  0(n)  interpolation 
P  matrices  must  be  applied,  each  of  bounded  size. 
Then,  on  level  L  an  additional  0(n)  column  skeleton 

B  matrices  must  be  applied,  each  of  bounded  size. 

Table  3:  Complexity  of  the  algorithm. 


Another  important  characteristic  of  any  numerical  algorithm  is  the  storage  requirement.  Table  4 
gives  an  overview  of  the  storage  requirements  for  our  algorithm.  We  ignore  requirements  for  work 
arrays,  and  assume  that  we  do  not  need  to  store  the  matrix  A  in  memory,  but  that  its  elements 
can  be  computed  when  needed. 


Observation  4.5  For  l  >  0,  i  =  1, 2, . . . ,  2‘,  and  j  =  1, 2, . . . ,  2L  1 ,  only  the  vectors  u\  ^ 


.i-i 


and  u 


i- 1 


L^h2j 


I4±±j,2j  — 1 

.  are  used  to  compute  u\  ■  in  Step  3.  Thus,  the  storage  requirement  for  the  application 


of  matrix  A  in  Step  3  can  be  relaxed  to  O(n). 
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Step 


Purpose 


Storage  needed 


Explanation 


T  .  .  .  /n,  ,  \  Only  the  dyadic  hierarchy  and  parameters  must  be 

Initialization  0(n  logn)  , 

stored. 

On  each  of  the  O(logn)  levels,  0{n)  interpolation 
Precomputation  „ ,  .  ,  P  matrices  must  be  stored,  each  of  bounded  size, 

compressions  On  level  L,  an  additional  0(n )  column  skeleton  B 

matrices  must  be  stored,  each  of  bounded  size. 


Application  of 
3  the  transform 
to  a  vector 


0(n  logn) 


On  each  of  the  O(logn)  levels,  we  store  the  vectors 
u\y  This  requires  0{n )  memory  for  each  level.  On 
level  L,  an  additional  0(n)  amount  of  storage  is 
required  to  calculated  and  store  the  vectors  y\. 


Table  4:  Storage  requirements  of  the  algorithm. 


Figure  3:  One  possible  adaptive  level  1  of  A. 

4.5  An  adaptive  algorithm 

The  algorithm  described  in  Sections  4.2  and  4.4  is  based  on  the  assumption  that  the  rank  of  two 
different  contiguous  submatrices  is  the  same  if  they  have  an  equal  number  of  elements.  In  practice, 
it  is  possible  for  this  rank  to  fluctuate.  To  avoid  calculating  an  interpolative  decomposition  for  a 
submatrix  whose  rank  is  too  large,  one  can  implement  an  adaptive  regime  to  dyadically  partition 
the  rows  as  much  as  needed  on  a  given  level.  This  results  in  accelerated  matrix  application  times 
(at  the  cost  of  possibly  longer  precomputation  times). 

To  implement  this  optimization,  we  require  that  the  maximum  allowable  rank  of  any  submatrix 
on  any  level  be  at  most  some  constant  kmax.  Previously,  it  was  only  possible  to  have  levels  of  the 
matrix  that  looked  like  those  in  Figures  1  and  2.  Now,  by  insisting  that  the  maximum  allowable 
rank  per  submatrix  is  at  most  kmax ,  it  is  possible  to  have  levels  resembling  that  in  Figure  3.  The 
columns  of  the  matrix  in  Figure  3  are  still  dyadically  partitioned,  but  the  rows  have  been  adaptively 
dyadically  partitioned,  depending  on  the  actual  rank  of  submatrices.  As  the  level  increases,  the 
merging  of  columns  and  subdivision  of  rows  continues  appropriately  with  respect  to  the  adaptive 
hierarchy. 

Observation  4.6  With  the  proper  choice  of  kmax,  we  have  yet  to  find  a  case  in  which  the  adaptive 
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algorithm  applies  a  matrix  slower  than  the  non-adaptive  algorithm.  It  has  not  been  determined 
whether  this  is  due  to  an  actual  decrease  in  the  complexity  of  the  algorithm  or  issues  related  to 
on-board  caching. 

5  Numerical  examples 

The  butterfly  algorithm  has  been  implemented  in  FORTRAN,  and  this  section  reports  numerical 
results  obtained  with  our  implementation.  In  all  cases,  we  used  the  Lahey/Fujitsu  compiler  with 
optimization  flag  — o2,  and  all  examples  were  run  on  a  2.4  GHz  Intel  Pentium  4  processor  with  1 
GB  of  RAM  and  an  L2  cache  of  512  KB.  CPU  times  listed  are  in  seconds  and  all  operations  were 
performed  using  double  precision  arithmetic.  All  examples  (except  where  otherwise  noted)  reflect 
the  adaptive  version  of  the  algorithm  as  discussed  in  Section  4.5. 

The  columns  labeled  “n”  list  the  size  of  the  matrices  to  which  the  algorithm  of  this  paper  was 
applied.  All  matrices  in  these  examples  are  square. 

The  columns  labeled  “precomputation”  list  the  times  taken  to  decompose  the  said  n  x  n  matrix. 
The  columns  labeled  “direct  eval”  list  the  times  taken  for  a  direct  n  x  n  matrix-vector  multi¬ 
plication.  Times  in  parenthesis  are  estimates. 

The  columns  labeled  “fast  eval”  list  the  times  taken  to  apply  the  nxn  matrix  using  the  algorithm 
of  this  paper. 

The  columns  labeled  “/2  error”  contain  the  relative  errors  between  the  solution  obtained  via  the 
algorithm  and  the  solution  obtained  using  a  direct  matrix-vector  multiplication. 

The  columns  labeled  “MB  used”  list  the  amount  of  memory  in  megabytes  required  by  the 
algorithm  for  precomputation  and  evaluation. 

In  each  example,  the  matrix  was  applied  to  a  normalized  vector  containing  random  numbers 
uniformly  distributed  on  the  interval  [0, 1].  The  value  of  kmax  was  chosen  so  as  to  produce  optimal 
run-times  for  size  n  =  2048.  To  find  this  value  of  kmax ,  the  adaptive  algorithm  was  run  in  each  ex¬ 
ample  for  size  n  =  2048  and  kmax  =  30,  32,  34, ...  ,  138, 140;  the  value  that  gave  the  best  evaluation 
time  was  chosen.  The  value  of  kmax  used  is  given  below  each  table,  and  s  =  1CU10  unless  otherwise 
stated.  No  effort  was  made  to  optimize  the  precomputation  required.  The  precomputation  scheme 
used  costs  (D(n2  logn),  as  described  in  Section  4.4. 

5.1  Fourier-Bessel  transforms 

Tables  5-9  document  results  of  applying  our  algorithm  to  Fourier-Bessel  transform  matrices  Tjv 
in  equation  (2.27)  with  v  =  0, 1, 100,  and  10000.  Table  10  contains  data  for  varying  e  in  the  case 
of  u  =  0  and  n  =  8192.  In  all  examples  R  =  1,  where  R  is  such  that  p\ .  p-2-  P'.h  ■  ■  ■  are  the  real 
increasing  positive  zeros  of  the  function  g  :  M+  — >  M, 

g(p)  =  Ju(2npR)  (5.1) 

(see  equations  (2.15-2.27)).  The  data  listed  in  Tables  5-9  are  illustrated  in  Figures  4-10,  respec¬ 
tively. 

Observation  5.1  An  examination  of  the  data  contained  in  Tables  5-9  shows  that  the  algorithm 
of  this  paper  has  consistent  performance  across  all  orders  of  transforms.  The  “fast  eval”  times 
are  virtually  independent  of  the  order  of  the  Fourier-Bessel  transform  matrix  being  applied.  This 
observation  is  illustrated  in  Table  9,  where  for  a  matrix  of  size  n  we  choose  v  =  n.  For  large  values 
of  n,  the  “fast  eval”  times  increase  almost  linearly.  Figure  9  illustrates  data  from  Tables  5-9  in  one 
plot. 
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Tables  11  and  12  contain  results  from  applying  the  algorithm  to  the  matrices  Ej  in  equa¬ 
tion  (2.30)  and  Eh  in  equation  (2.31),  respectively.  In  both  matrices  Ej  and  Eh,  x\,  X2,  ■  ■  ■ ,  xn 
are  given  by 

27T 

Zj  =  n  + — (j  -  i).  (5.2) 

As  a  consequence  of  Lemma  2.6,  choosing  xi,X2,  ■  ■  ■  ,xn  as  in  equation  (5.2)  ensures  that  all  com¬ 
puted  values  of  Bessel  functions  and  Hankel  functions  are  properly  scaled. 

Remark  5.2  Tables  11  and  12  reflect  results  from  applying  the  algorithm  to  the  matrices  Ej 
and  Ejj,  respectively.  This  was  done  to  greatly  reduce  extended  precomputation  times  due  to 
the  calculation  of  Bessel  and  Hankel  functions  using  their  recurrence  relations.  As  mentioned  in 
Observation  4.2,  applying  the  algorithm  to  Ej  and  E*H  yields  results  similar  to  those  which  would 
have  been  obtained  from  applying  the  algorithm  to  Ej  and  Eh- 
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n 

Precomputation 

Direct  eval 

Fast  eval 

lz  error 

MB  used 

256 

.32E+00 

.24E-03 

.22E-03 

.93E-12 

.45E+00 

512 

.86E+00 

.12E-02 

.88E-03 

.23E-11 

.13E+01 

1024 

.34E+01 

.50E-02 

.24E-02 

.15E-11 

.36E+01 

2048 

.15E+02 

.20E-01 

.64E-02 

.29E-11 

.93E+01 

4096 

.70E+02 

.81E-01 

.16E-01 

.21E-11 

.23E+02 

8192 

.32E+03 

.32E+00 

.38E-01 

.23E-11 

.56E+02 

16384 

.15E+04 

(.13E+01) 

.91E-01 

.22E-11 

.13E+03 

32768 

.65E+04 

(.52E+01) 

.21E+00 

.23E-11 

.31E+03 

65536 

.28E+05 

(.21E+02) 

.50E+00 

.11E-11 

.72E+03 

Table  5:  Times  and  errors  for  Fourier-Bessel  transform  with  v  =  0  and  kmax  =  70. 


n 


Figure  4:  Comparison  of  the  accelerated  Fourier-Bessel  transform  of  order  0  and  its  direct  calcula¬ 
tion. 
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n 

Precomputation 

Direct  eval 

Fast  eval 

lz  error 

MB  used 

256 

.35E+00 

.24E-03 

.21E-03 

.26E-11 

.45E+00 

512 

.92E+00 

.12E-02 

.88E-03 

.31E-11 

.13E+01 

1024 

.35E+01 

.50E-02 

.24E-02 

.21E-11 

.36E+01 

2048 

.15E+02 

.20E-01 

.64E-02 

.23E-11 

.92E+01 

4096 

.72E+02 

.81E-01 

.16E-01 

.20E-11 

.23E+02 

8192 

.32E+03 

.32E+00 

.38E-01 

.19E-11 

.56E+02 

16384 

.14E+04 

(.13E+01) 

.90E-01 

.30E-11 

.13E+03 

32768 

.65E+04 

(.52E+01) 

.21E+00 

.25E-11 

.31E+03 

65536 

.29E+05 

(.21E+02) 

.50E+00 

.21E-11 

.72E+03 

Table  6:  Times  and  errors  for  Fourier-Bessel  transform  with  v  =  1  and  kmax  =  72. 


Figure  5:  Comparison  of  the  accelerated  Fourier-Bessel  transform  of  order  1  and  its  direct  calcula¬ 
tion. 
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n 

Precomputation 

Direct  eval 

Fast  eval 

l2  error 

MB  used 

256 

.36E+00 

.24E-03 

.18E-03 

.47E-11 

.42E+00 

512 

.11E+01 

.12E-02 

.84E-03 

.72E-11 

.13E+01 

1024 

.43E+01 

.50E-02 

.24E-02 

.11E-10 

.35E+01 

2048 

.19E+02 

.20E-01 

.62E-02 

.12E-10 

.91E+01 

4096 

.88E+02 

.81E-01 

.16E-01 

.14E-10 

.23E+02 

8192 

.40E+03 

.32E+00 

.38E-01 

.11E-10 

.56E+02 

16384 

.18E+04 

(.13E+01) 

.91E-01 

.97E-11 

.13E+03 

32768 

.77E+04 

(.52E+01) 

.21E+00 

.12E-10 

.31E+03 

65536 

.34E+05 

(.21E+02) 

.50E+00 

.12E-10 

.73E+03 

Table  7:  Times  and  errors  for  Fourier-Bessel  transform  with  v  =  100  and  kmax  =  74. 


Figure  6:  Comparison  of  the  accelerated  Fourier-Bessel  transform  of  order  100  and  its  direct  cal¬ 
culation. 
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n 

Precomputation 

Direct  eval 

Fast  eval 

lz  error 

MB  used 

256 

.22E+00 

.24E-03 

.54E-04 

.22E-11 

.15E+00 

512 

.64E+00 

.12E-02 

.32E-03 

.46E-11 

.55E+00 

1024 

.25E+01 

.50E-02 

.  14E-02 

.14E-10 

.21E+01 

2048 

.12E+02 

.20E-01 

.45E-02 

.22E-10 

.65E+01 

4096 

.62E+02 

.81E-01 

.13E-01 

.29E-10 

.19E+02 

8192 

.31E+03 

.32E+00 

.34E-01 

.26E-10 

.50E+02 

16384 

.15E+04 

(.13E+01) 

.84E-01 

.31E-10 

.12E+03 

32768 

.71E+04 

(.52E+01) 

.20E+00 

.32E-10 

.30E+03 

65536 

.32E+05 

(.21E+02) 

.48E+00 

.29E-10 

.71E+03 

Table  8:  Times  and  errors  for  Fourier-Bessel  transform  with  v  =  10000  and  kmaX  =  80. 


n 


Figure  7:  Comparison  of  the  accelerated  Fourier-Bessel  transform  of  order  10000  and  its  direct 
calculation. 
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n 

Precomputation 

Direct  eval 

Fast  eval 

l2  error 

MB  used 

256 

.36E+00 

.24E-03 

.16E-03 

.76E-11 

.40E+00 

512 

.97E+00 

.12E-02 

.75E-03 

.14E-10 

.11E+01 

1024 

.39E+01 

.50E-02 

.21E-02 

.17E-10 

.31E+01 

2048 

.17E+02 

.20E-01 

.57E-02 

.21E-10 

.85E+01 

4096 

.77E+02 

.81E-01 

.15E-01 

.25E-10 

.22E+02 

8192 

.35E+03 

.32E+00 

.35E-01 

.27E-10 

.52E+02 

16384 

.16E+04 

(.13E+01) 

.85E-01 

.34E-10 

.13E+03 

32768 

.68E+04 

(.52E+01) 

.19E+00 

.33E-10 

.29E+03 

65536 

.29E+05 

(.21E+02) 

.46E+00 

.38E-10 

.69E+03 

Table  9:  Times  and  errors  for  Fourier-Bessel  transform  with  v  =  n  and  kmax  =  94. 


n 


Figure  8:  Comparison  of  the  accelerated  Fourier-Bessel  transform  of  order  n  and  its  direct  calcula¬ 
tion. 
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10 


Figure  9:  Comparison  of  accelerated  Fourier-Bessel  transforms  of  many  orders  and  their  direct 
calculation. 
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N 

e 

Precomputation 

Direct  eval 

Fast  eval 

l2  error 

MB  used 

8192 

10-3 

.24E+03 

.32E+00 

.25E-01 

.19E-04 

.35E+02 

8192 

10-4 

.26E+03 

.32E+00 

.27E-01 

.21E-05 

.38E+02 

8192 

10-5 

.27E+03 

.32E+00 

.29E-01 

.19E-06 

.41E+02 

8192 

10-6 

.28E+03 

.32E+00 

.31E-01 

.20E-07 

.44E+02 

8192 

10-7 

.30E+03 

.32E+00 

.33E-01 

.24E-08 

.47E+02 

8192 

10"8 

.31E+03 

.32E+00 

.35E-01 

.22E-09 

.50E+02 

8192 

10"9 

.33E+03 

.32E+00 

.36E-01 

.17E-10 

.53E+02 

8192 

lO-io 

.34E+03 

.32E+00 

.38E-01 

.23E-11 

.56E+02 

8192 

10_n 

.35E+03 

.32E+00 

.41E-01 

.22E-12 

.60E+02 

8192 

10-12 

.37E+03 

.32E+00 

.57E-01 

.16E-13 

.80E+02 

Table  10:  Times  and  errors  for  Fourier-Bessel  transform  with  v  =  0,  kmax  =  70,  and  e  = 
{10-3, . . . ,  10~12}. 


Figure  10:  Comparison  of  the  accelerated  Fourier-Bessel  transforms  of  order  0  for  varying  precision. 
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n 

Precomputation 

Direct  eval 

Fast  eval 

l2  error 

MB  used 

256 

.35E+00 

.24E-03 

.75E-04 

.41E-10 

.22E+00 

512 

.67E+00 

.12E-02 

.26E-03 

.54E-10 

.52E+00 

1024 

.20E+01 

.50E-02 

.83E-03 

.51E-10 

.13E+01 

2048 

.75E+01 

.20E-01 

.19E-02 

.44E-10 

.28E+01 

4096 

.31E+02 

.81E-01 

.44E-02 

.45E-10 

.67E+01 

8192 

.13E+03 

.32E+00 

.95E-02 

.49E-10 

.15E+02 

16384 

.59E+03 

(.13E+01) 

.21E-01 

.52E-10 

.34E+02 

32768 

.26E+04 

(.52E+01) 

.46E-01 

.66E-10 

.79E+02 

65536 

.12E+05 

(.21E+02) 

.lOE+OO 

.73E-10 

.20E+03 

Table  11:  Times  and  errors  for  evaluating  Bessel  function  expansions  with  kmax  =  72. 


n 


Figure  11:  Comparison  of  the  accelerated  evaluation  of  Bessel  function  expansions  and  their  direct 
evaluation. 
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n 

Precomputation 

Direct  eval 

Fast  eval 

l2  error 

MB  used 

256 

.36E+00 

.67E-03 

.11E-03 

.25E-10 

.24E+00 

512 

.76E+00 

.27E-02 

.34E-03 

.86E-10 

.54E+00 

1024 

.20E+01 

.llE-01 

.93E-03 

.43E-10 

.13E+01 

2048 

.62E+01 

.43E-01 

.19E-02 

.43E-10 

.27E+01 

4096 

.22E+02 

.18E+00 

.44E-02 

.51E-10 

.64E+01 

8192 

.94E+02 

(.72E+00) 

.92E-02 

.49E-10 

.15E+02 

16384 

.39E+03 

(.29E+01) 

.20E-01 

.52E-10 

.37E+02 

32768 

.17E+04 

(.12E+02) 

.42E-01 

.56E-10 

.10E+03 

65536 

.77E+04 

(.48E+02) 

.89E-01 

.55E-10 

.31E+03 

Table  12:  Times  and  errors  for  evaluating  Hankel  function  expansions  with  kmax  =  38. 


n 


Figure  12:  Comparison  of  the  the  accelerated  evaluation  of  Hankel  function  expansions  and  their 
direct  evaluation. 
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5.2  Fourier  transforms 


Tables  13  and  14  document  results  obtained  from  applying  the  algorithm  of  this  paper  to  the  matrix 
Tp  in  equation  (2.45).  Table  13  contains  results  from  applying  our  algorithm  to  the  standard 
equispaced  discrete  Fourier  transform  matrix  Tp.  Table  14  contains  results  from  applying  the 
algorithm  to  the  matrix  Tp  where  X\  <  X2  <  •  •  •  <  xn  are  random  numbers  uniformly  distributed 
on  the  interval  [— ir,  7 r]  and  <  0J2  <  •  •  •  <  con  are  random  numbers  uniformly  distributed  on 
the  interval  [—§,§]•  The  data  listed  in  Tables  13  and  14  are  illustrated  in  Figures  13  and  14, 
respectively. 

Observation  5.3  As  a  “numerical  proof”  of  Corollary  3.4,  for  n  =  16384  we  examined  the  nu¬ 
merical  rank  of  each  contiguous  submatrix  of  the  discrete  Fourier  transform  matrix  Tp  which  was 
decomposed  using  the  non-adaptive  regime  of  the  algorithm.  For  Cmax  =  64,  the  numerical  ranks 
ranged  from  85  to  88  (except  on  the  finest  level  where  matrices  were  dimensioned  64  xn).  This  ob¬ 
servation  is  consistent  with  Corollary  3.4;  numerical  ranks  of  m  x  k  contiguous  submatrices  depend 
only  on  the  quantity  mk.  For  large  values  of  n,  the  “fast  eval”  times  increase  almost  linearly. 
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n 

Precomputation 

Direct  eval 

Fast  eval 

l2  error 

MB  used 

256 

.32E+00 

.67E-03 

.65E-03 

.34E-11 

.95E+00 

512 

.13E+01 

.27E-02 

.18E-02 

.91E-11 

.26E+01 

1024 

.60E+01 

.11E-01 

.48E-02 

.11E-10 

.67E+01 

2048 

.28E+02 

.43E-01 

.12E-01 

.12E-10 

.16E+02 

4096 

.13E+03 

.18E+00 

.28E-01 

.15E-10 

.39E+02 

8192 

.56E+03 

(.72E+00) 

.65E-01 

.21E-10 

.90E+02 

16384 

.26E+04 

(.29E+01) 

.15E+00 

.24E-10 

.20E+03 

32768 

.12E+05 

(.12E+02) 

.34E+00 

.23E-10 

.46E+03 

Table  13:  Times  and  errors  for  discrete  Fourier  transform  with  kmax  =  74. 


Figure  13:  Comparison  of  the  accelerated  discrete  Fourier  transform  and  its  direct  calculation. 
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n 

Precomputation 

Direct  eval 

Fast  eval 

l2  error 

MB  used 

256 

.34E+00 

.67E-03 

.62E-03 

.76E-09 

.91E+00 

512 

.13E+01 

.27E-02 

.18E-02 

.84E-08 

.25E+01 

1024 

.62E+01 

.llE-01 

.47E-02 

.17E-08 

.65E+01 

2048 

.29E+02 

.43E-01 

.12E-01 

.27E-09 

.16E+02 

4096 

.13E+03 

.18E+00 

.28E-01 

.72E-09 

.38E+02 

8192 

.62E+03 

(.72E+00) 

.65E-01 

.47E-09 

.89E+02 

16384 

.25E+04 

(.29E+01) 

.15E+00 

.79E-09 

.20E+03 

32768 

.11E+05 

(.12E+02) 

.33E+00 

.15E-08 

.46E+03 

Table  14:  Times  and  errors  for  discrete  Fourier  transform  for  nonequispaced  data  with  kmax  =  74. 


Figure  14:  Comparison  of  the  accelerated  discrete  Fourier  transform  for  nonequispaced  data  and 
its  direct  calculation. 
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5.3  Legendre  polynomial  transforms 

Table  15  contains  results  for  using  our  algorithm  to  apply  the  Legendre  transform  matrix  Tp  in 
equation  (2.66).  Figure  15  illustrates  the  data  listed  in  Table  15.  Table  16  contains  a  comparison 
between  the  adaptive  and  the  non-adaptive  schemes  of  the  algorithm  when  used  to  apply  the  matrix 
Tp.  Columns  denoted  with  an  “A”  list  results  obtained  from  the  adaptive  scheme.  Columns  denoted 
with  an  “NA”  list  results  obtained  from  the  non-adaptive  scheme. 

Observation  5.4  Despite  an  increase  in  precomputation  times,  the  adaptive  version  of  the  algo¬ 
rithm  results  in  faster  evaluation  times  than  the  non-adaptive  version  for  every  size  n,  even  though 
the  parameter  kmax  was  chosen  only  to  produce  an  optimal  evaluation  time  for  n  =  2048.  It  is 
conceivable  that  for  each  n,  kmax  could  be  chosen  to  further  improve  evaluation  times. 

Observation  5.5  The  algorithm  of  this  paper,  non-adaptive  as  well  as  adaptive  schemes,  can  also 
be  used  as  a  technique  for  matrix  compression.  Not  only  does  the  algorithm  enable  the  0(n  logn) 
application  of  an  n  x  n  matrix,  but,  if  the  only  purpose  for  storing  the  matrix  is  to  apply  it  at  a  later 
time,  we  only  require  0{n  log  n)  memory  for  storage.  For  example,  to  directly  store  a  65536  x  65536 
matrix  we  require  232  =  4,  294, 967,  296  ~  .43E+10  real  *8  words  of  memory.  Using  this  algorithm, 
we  can  compress  and  apply  the  same  size  Legendre  transform  matrix  Tp  to  10-digit  precision  using 
only  .92E+08  real  *8  words  of  memory.  Memory  usage  was  not  fully  optimized,  as  this  was  not 
the  main  goal  of  the  paper.  Even  so,  we  observed  a  decrease  in  memory  requirement  by  a  factor  of 
almost  50. 

5.4  Chebyshev  polynomial  transforms 

Table  17  contains  the  results  for  using  the  algorithm  of  this  paper  to  apply  the  matrix  Tp  in 
equation  (2.86).  Figure  16  illustrates  the  data  listed  in  Table  17. 

5.5  Hermite  function  transforms 

Table  18  contains  the  results  for  using  the  algorithm  of  this  paper  to  apply  the  matrix  Tp  in 
equation  (2.110).  Figure  17  illustrates  the  data  listed  in  Table  18. 

5.6  Laguerre  function  transforms 

Table  19  contains  the  results  for  using  the  algorithm  of  this  paper  to  apply  the  matrix  Tp  in 
equation  (2.130).  Figure  18  illustrates  the  data  listed  in  Table  19. 

Observation  5.6  An  examination  of  Tables  15,  17-19  shows  that  our  algorithm  gives  consistent 
application  times  for  every  size  n  independently  of  which  orthogonal  polynomial  transform  matrix 
is  used.  This  is  illustrated  in  Figure  19. 

5.7  Prolate  spheroidal  wave  function  interpolation 

Table  20  contains  the  results  for  using  our  algorithm  to  apply  the  matrix  in  equation  (2.142) 
with  c  =  8500.5  and  x\,  X2,  ■  ■  ■ ,  xn  the  roots  of  the  function  V’n-  Figure  20  illustrates  the  data  listed 
in  Table  20. 
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n 

Precomputation 

Direct  eval 

Fast  eval 

l 2  error 

MB  used 

256 

.38E+00 

.24E-03 

.24E-03 

.69E-11 

.43E+00 

512 

.90E+00 

.12E-02 

.94E-03 

.11E-10 

.13E+01 

1024 

.33E+01 

.50E-02 

.26E-02 

.10E-10 

.35E+01 

2048 

.14E+02 

.20E-01 

.68E-02 

.80E-11 

.90E+01 

4096 

.66E+02 

.81E-01 

.17E-01 

.82E-11 

.23E+02 

8192 

.31E+03 

.32E+00 

.41E-01 

.92E-11 

.55E+02 

16384 

.14E+04 

(.13E+01) 

.98E-01 

.93E-11 

.13E+03 

32768 

.65E+04 

(.52E+01) 

.23E+00 

.92E-11 

.30E+03 

65536 

.28E+05 

(.21E+02) 

.54E+00 

.11E-10 

.71E+03 

Table  15:  Times  and  errors  for  Legendre  polynomial  transform  with  kmax  =  70. 


n 


Figure  15:  Comparison  of  the  accelerated  Legendre  polynomial  transform  and  its  direct  calculation. 
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n 

Precomp  NA 

Precomp  A 

Direct  eval 

Fast  eval  NA 

Fast  eval  A 

l2  error  NA 

l2  error  A 

256 

.29E+00 

.29E+00 

.24E-03 

.20E-03 

.19E-03 

.51E-11 

.69E-11 

512 

.65E+00 

.76E+00 

.12E-02 

.88E-03 

.87E-03 

.53E-11 

.llE-10 

1024 

.25E+01 

.29E+01 

.50E-02 

.25E-02 

.24E-02 

.84E-11 

.10E-10 

2048 

.12E+02 

.13E+02 

.20E-01 

.65E-02 

.63E-02 

.75E-11 

.80E-11 

4096 

.53E+02 

.63E+02 

.81E-01 

.16E-01 

.16E-01 

.llE-10 

.82E-11 

8192 

.25E+03 

.30E+03 

.32E+00 

.40E-01 

.38E-01 

.10E-10 

.92E-11 

16384 

.11E+04 

.14E+04 

(.13E+01) 

.94E-01 

.90E-01 

.10E-10 

.93E-11 

32768 

.44E+04 

.64E+04 

(.52E+01) 

.22E+00 

.21E+00 

.96E-11 

.92E-11 

65536 

.20E+05 

.29E+05 

(.21E+02) 

.50E+00 

.49E+00 

.llE-10 

.llE-10 

Table  16:  Comparison  of  adaptive  scheme  and  non-adaptive  scheme  for  computing  Legendre  polynomial  transform.  In  the  adaptive 
kmax  =  70.  In  the  non-adaptive  case  Cmax  =  64. 


n 

Precomputation 

Direct  eval 

Fast  eval 

l2  error 

MB  used 

256 

.41E+00 

.24E-03 

.22E-03 

.76E-11 

.45E+00 

512 

.85E+00 

.12E-02 

.90E-03 

.72E-11 

.14E+01 

1024 

.31E+01 

.50E-02 

.25E-02 

.16E-10 

.36E+01 

2048 

.13E+02 

.20E-01 

.64E-02 

.16E-10 

.93E+01 

4096 

.61E+02 

.81E-01 

.16E-01 

.22E-10 

.23E+02 

8192 

.28E+03 

.32E+00 

.38E-01 

.22E-10 

.56E+02 

16384 

.13E+04 

(.13E+01) 

.91E-01 

.23E-10 

.13E+03 

32768 

.56E+04 

(.52E+01) 

.21E+00 

.23E-10 

.31E+03 

65536 

.25E+05 

(.21E+02) 

.50E+00 

.25E-10 

.72E+03 

Table  17:  Times  and  errors  for  Chebyshev  polynomial  transform  with  kmax  =  76. 


n 


Figure  16:  Comparison  of  the  accelerated  Chebyshev  polynomial  transform  and  its  direct  calcula¬ 
tion. 
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n 

Precomputation 

Direct  eval 

Fast  eval 

l2  error 

MB  used 

256 

.44E+00 

.24E-03 

.18E-03 

.38E-11 

.45E+00 

512 

.12E+01 

.12E-02 

.85E-03 

.15E-10 

.13E+01 

1024 

.49E+01 

.50E-02 

.23E-02 

.22E-10 

.35E+01 

2048 

.24E+02 

.20E-01 

.61E-02 

.28E-10 

.91E+01 

4096 

.13E+03 

.81E-01 

.15E-01 

.33E-10 

.23E+02 

8192 

.63E+03 

.32E+00 

.37E-01 

.34E-10 

.54E+02 

16384 

.32E+04 

(.13E+01) 

.88E-01 

.38E-10 

.13E+03 

32768 

.15E+05 

(.52E+01) 

.20E+00 

.46E-10 

.30E+03 

65536 

.66E+05 

(.21E+02) 

.47E+00 

.51E-10 

.71E+03 

Table  18:  Times  and  errors  for  Hermite  function  transform  with  kmax  =  90. 


Figure  17:  Comparison  of  the  accelerated  Hermite  function  transform  and  its  direct  calculation. 
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n 

Precomputation 

Direct  eval 

Fast  eval 

l 2  error 

MB  used 

256 

.48E+00 

.24E-03 

.18E-03 

.18E-10 

.43E+00 

512 

.12E+01 

.12E-02 

.82E-03 

.50E-10 

.12E+01 

1024 

.52E+01 

.50E-02 

.23E-02 

.86E-10 

.34E+01 

2048 

.27E+02 

.20E-01 

.62E-02 

.17E-09 

.89E+01 

4096 

.14E+03 

.81E-01 

.15E-01 

.17E-09 

.22E+02 

8192 

.68E+03 

.32E+00 

.36E-01 

.28E-09 

.53E+02 

16384 

.33E+04 

(.13E+01) 

.86E-01 

.32E-09 

.12E+03 

32768 

.15E+05 

(.52E+01) 

.20E+00 

.40E-09 

.29E+03 

65536 

.65E+05 

(.21E+02) 

.47E+00 

.60E-09 

.69E+03 

Table  19:  Times  and  errors  for  Laguerre  function  transform  with  kmax  =  78. 


Figure  18:  Comparison  of  the  accelerated  Laguerre  function  transform  and  its  direct  calculation. 
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Figure  19:  Comparison  of  the  accelerated  Legendre,  Chebyshev,  Hermite,  and  Laguerre  transforms 
and  their  direct  calculation. 
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n 

Precomputation 

Direct  eval 

Fast  eval 

l2  error 

MB  used 

256 

.21E+01 

.24E-03 

.20E-03 

.11E-11 

.45E+00 

512 

.12E+02 

.12E-02 

.86E-03 

.18E-10 

.13E+01 

1024 

.75E+02 

.50E-02 

.24E-02 

.18E-10 

.35E+01 

2048 

.43E+03 

.20E-01 

.71E-02 

.27E-10 

.10E+02 

4096 

.25E+04 

.81E-01 

.20E-01 

.28E-10 

.30E+02 

8192 

.14E+05 

.32E+00 

.70E-01 

.52E-11 

.10E+03 

Table  20:  Times  and  errors  for  evaluating  prolate  spheroidal  wave  function  expansions  with  kmax  = 
90  and  c  =  8500.5. 


n 


Figure  20:  Comparison  of  the  accelerated  evaluation  of  prolate  spheroidal  wave  function  expansions 
and  their  direct  evaluation. 
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6  Conclusions  and  future  work 


We  have  presented  an  algorithm  for  the  numerical  computation  of  special  function  transforms  of  the 
type  in  equation  (1.3).  The  asymptotic  computational  complexity  of  our  scheme  is  0(n  logn).  The 
algorithm  compresses  the  transform  matrix  using  strictly  analysis-based  rank  considerations  involv¬ 
ing  contiguous  submatrices.  Interpolative  decompositions  (see  Section  2.1)  are  used  for  the  com¬ 
pression  of  each  contiguous  submatrix.  The  asymptotic  cost  of  this  precomputation  is  0(n 2  logn). 
Fairly  simple  modifications  are  being  implemented  reducing  this  cost  to  0(n  log2  n).  The  analysis- 
based  rank  estimates  have  been  proven  for  the  cases  of  the  Fourier  transform,  Fourier-Bessel  trans¬ 
form,  and  Legendre  transform.  Numerical  examples  demonstrate  a  much  wider  applicability;  anal¬ 
ysis  of  these  cases  is  in  progress  and  will  be  reported  on  at  a  later  date. 

In  addition  to  enabling  the  accelerated  application  of  certain  matrices,  it  was  observed  that  the 
algorithm  of  this  paper  can  be  used  as  a  tool  for  matrix  compression.  The  n  x  n  matrices  that  we 
have  examined  were  compressed  using  approximately  O  (n  log  n)  memory. 

An  implementation  of  our  algorithm  is  currently  under  development  for  the  acceleration  of  asso¬ 
ciated  Legendre  function  transforms  of  arbitrary  order.  Also,  there  exist  certain  classes  of  matrices 
where  theory  has  yet  to  be  developed,  but  numerical  tests  show  that  our  algorithm  accelerates  their 
application.  Extensions  of  the  algorithm  to  higher  dimensions  are  straightforward,  as  is  the  theory, 
but  implementations  and  proofs  are  more  involved  and  are  currently  being  developed. 
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